Abstract

We focus on estimating the integrated covariance of log-price processes in the presence of market
microstructure noise. We construct an efficient unbiased estimator for the quadratic covariation of
two Itô processes in the case where high-frequency asynchronous discrete returns under market
microstructure noise are observed. This estimator is based on synchronization and multi-scale
methods and attains the optimal rate of convergence. A Monte Carlo study analyzes the finite sample
1 Introduction



Estimating the quadratic covariation, also called integrated covariance, of asset returns is a central 
theme in finance. With the availability of high-frequency intraday returns the estimation of daily inte- 
grated covariances using high-frequency observations became an issue of great interest. The problems 
occurring in covariance estimation using high-frequency data are mainly the lack of synchronicity and 
market microstructure noise. 

In this article we propose a new estimator for the integrated covariance $\langle X, Y \rangle_T$ of two log-price
processes over a fixed time horizon $[0, T]$ (usually one trading day) when we observe high-frequency
noisy asynchronous data. The problem of asynchronous data without noise was solved by Hayashi and
Yoshida (2005), and recent literature also provides estimators that solve the problem
of noisy but synchronous data (see e.g. Barndorff-Nielsen and Shephard (2004)).
We work within the model where the efficient asset processes $X$ and $Y$ (without noise) are assumed
to be Itô processes

$$dX_t = \mu_t^X\,dt + \sigma_t^X\,dB_t^X, \qquad dY_t = \mu_t^Y\,dt + \sigma_t^Y\,dB_t^Y, \qquad t \in [0, T],$$

with Brownian motions $B^X$ and $B^Y$ which are correlated with $\operatorname{corr}(B_t^X, B_t^Y) = \rho_t$ and continuous,
bounded and adapted stochastic processes $\mu_t^X, \mu_t^Y, \sigma_t^X, \sigma_t^Y$. It is well known that for synchronous
observations without noise the realized covariance $\sum_{t_{i+1} \le T} (X_{t_{i+1}} - X_{t_i})(Y_{t_{i+1}} - Y_{t_i})$ is a consistent
estimator for $\langle X, Y \rangle_T = \int_0^T \rho_t \sigma_t^X \sigma_t^Y\,dt$ if $\sup_i (t_{i+1} - t_i) \to 0$.
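The consistency of the realized covariance for synchronous, noise-free data can be illustrated by a small simulation. The following is a minimal sketch under simplifying assumptions that are not part of the model above: zero drift, constant volatilities and constant correlation, and an equidistant grid. The function names `simulate_correlated_bm` and `realized_covariance` are hypothetical.

```python
import math
import random


def simulate_correlated_bm(n, T, sigma_x, sigma_y, rho, rng):
    """Simulate X and Y on an equidistant grid with n increments.

    Assumption: zero drift and constant sigma_x, sigma_y, rho.
    """
    dt = T / n
    x, y = [0.0], [0.0]
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        db_x = math.sqrt(dt) * z1
        # Second Brownian increment with correlation rho to the first one.
        db_y = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        x.append(x[-1] + sigma_x * db_x)
        y.append(y[-1] + sigma_y * db_y)
    return x, y


def realized_covariance(x, y):
    """Sum of products of simultaneous increments."""
    return sum((x[i + 1] - x[i]) * (y[i + 1] - y[i]) for i in range(len(x) - 1))


rng = random.Random(1)
x, y = simulate_correlated_bm(n=20000, T=1.0, sigma_x=1.0, sigma_y=1.0, rho=0.5, rng=rng)
rc = realized_covariance(x, y)  # target value: rho * sigma_x * sigma_y * T = 0.5
```

With these constant parameters the integrated covariance equals $\rho\,\sigma^X \sigma^Y T$, so the estimate can be compared directly with 0.5.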

In the case of high-frequency data, e. g. tick-by-tick data, the observations of the asset processes are 
usually not simultaneous and estimation methods are hence based on synchronization of the data by 
interpolation (e. g. linear or previous-tick interpolation) before calculating a realized covariance esti- 
mator. 

We use the construction of a synchronized time grid for two processes following the method
proposed by Palandri (2006). The realized covariance calculated with the synchronized observations
corresponds to the Hayashi-Yoshida estimator given by the sum of all products of increments with
overlapping time intervals. This estimator is proved to be a consistent estimator in the absence of
noise (Hayashi and Yoshida (2005)) but becomes inconsistent when market microstructure effects are
relevant. The behaviour of this estimator and of the realized covariance depending on the sampling
frequencies is studied in Hansen et al. (2007) and Griffin and Oomen (2006).

Market microstructure effects and estimators for the integrated volatility under their influence have been
studied intensively in recent literature. Estimators with an optimal rate of convergence $N^{1/4}$, where $N$
denotes the number of observations, are presented by Zhang (2006a) and Barndorff-Nielsen and
Shephard (2006). Merging those techniques for asynchronous data and the subsampling approach to
high-frequency observations contaminated by market microstructure noise as presented in Zhang et al.
(2005), a consistent estimator for asynchronous noisy data can be achieved. By construction of an
adequate synchronized time-scale for two asset processes and subsampling, an estimator with $N^{1/6}$-rate
of convergence can be obtained, where $N$ denotes the number of synchronized observations in this
context, which is less than or equal to the minimum of the numbers of observations of $X$ and $Y$. This result is
presented in Palandri (2006). We use the same methods to rearrange the observations in a synchronized
grid and show how an extension of subsampling to a multi-scale approach yields an estimator
with a more efficient rate of convergence $N^{1/4}$, which we prove to be the best attainable rate. For
this purpose we prove local asymptotic normality with rate $N^{1/4}$ for a simplified model and obtain
bounds for the asymptotic Fisher information. With the minimax theorem we conclude that $N^{1/4}$ is
a lower bound for the rate of convergence even in the synchronous equidistant case. Our estimator
hence upgrades the estimator proposed in Palandri (2006) to a rate-optimal consistent estimator for
integrated covariances and leads to a suitable implementation of integrated variance and covariance
estimation following the multi-scale approach invented by Zhang (2006a) and our extension for the
covariance case. Barndorff-Nielsen et al. (2008) present a multivariate realized kernel estimator that,
furthermore, is guaranteed to be positive semi-definite, which is, aside from non-synchronicity and noise,
a third important issue in multivariate considerations. Their estimator has an $N^{1/5}$-rate of convergence.

In Section 2 we introduce the model and our basic notation. We present a brief outline and the two
main results of this article concerning the asymptotics of our multi-scale estimator for the quadratic
covariation and local asymptotic normality.



2 Model and Main Results 

We want to obtain a consistent estimator for the covariation $\langle X, Y \rangle_T$ of two Itô processes $X_t$ and $Y_t$
over a fixed time interval $[0, T]$, e.g. one trading day, when we observe discrete asynchronous returns
contaminated by market microstructure noise. The observations of $X$ will be denoted by

$$\tilde{X}_{t_0}, \ldots, \tilde{X}_{t_n}, \quad 0 \le t_0 < t_1 < \ldots < t_n \le T, \quad \text{with increments } \Delta\tilde{X}_{t_i} = \tilde{X}_{t_i} - \tilde{X}_{t_{i-1}},$$

and the observations of another log-price process $Y$ by

$$\tilde{Y}_{\tau_0}, \ldots, \tilde{Y}_{\tau_m}, \quad 0 \le \tau_0 < \tau_1 < \ldots < \tau_m \le T, \quad \text{with increments } \Delta\tilde{Y}_{\tau_j} = \tilde{Y}_{\tau_j} - \tilde{Y}_{\tau_{j-1}}.$$

Synchronous data would mean that $m = n$ and $t_i = \tau_i$ for all $i \in \{0, \ldots, n\}$. We consider the general
case where the number of observations may differ and the sets of observation times $\mathcal{T}^X = \{t_0, \ldots, t_n\}$
and $\mathcal{T}^Y = \{\tau_0, \ldots, \tau_m\}$ also contain points $t_i \notin \mathcal{T}^Y$ and $\tau_j \notin \mathcal{T}^X$. Usually the considered time interval
$[0, T]$ starts with the first observation, which means $t_0 = 0$ or $\tau_0 = 0$. We work within the model
imposed by the following two assumptions:

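Assumption 1. The observed log-price processes are described by the sums of efficient Itô processes
and independent (discrete-time) noise processes $\epsilon^X_{t_i}$ and $\epsilon^Y_{\tau_j}$:

$$\tilde{X}_{t_i} = X_{t_i} + \epsilon^X_{t_i}, \quad i \in \{0, \ldots, n\},$$
$$\tilde{Y}_{\tau_j} = Y_{\tau_j} + \epsilon^Y_{\tau_j}, \quad j \in \{0, \ldots, m\}.$$

On a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t))$ the efficient processes are defined by

$$dX_t = \mu_t^X\,dt + \sigma_t^X\,dB_t^X,$$
$$dY_t = \mu_t^Y\,dt + \sigma_t^Y\,dB_t^Y,$$

where $B^X$ and $B^Y$ are two $\mathcal{F}_t$-adapted correlated standard Brownian motions with correlation $\rho_t$. The
drift $\mu_t$ and spot volatility $\sigma_t$ for both are $\mathcal{F}_t$-adapted, continuous and bounded stochastic processes.

Assumption 2. The errors $\epsilon^X_{t_i}$, $i \in \{0, \ldots, n\}$, and $\epsilon^Y_{\tau_j}$, $j \in \{0, \ldots, m\}$, due to market microstructure
noise are assumed to be i.i.d. processes that are independent of each other and of the efficient processes. We
also assume

$$\mathbb{E}\epsilon^X := \mathbb{E}\epsilon^X_{t_i} = \mathbb{E}\epsilon^Y := \mathbb{E}\epsilon^Y_{\tau_j} = 0 \quad \forall\, i, j$$

and $\mathbb{E}(\epsilon^X)^4 < \infty$, where $\mathbb{E}(\epsilon^X)^4 := \mathbb{E}\left[(\epsilon^X_{t_i})^4\right]$ and $\mathbb{E}(\epsilon^Y)^4 := \mathbb{E}\left[(\epsilon^Y_{\tau_j})^4\right]$ analogously.

The variances of the noise processes will be denoted by

$$\eta_X^2 := \mathbb{E}\left[(\epsilon^X_{t_i})^2\right] \quad \text{and} \quad \eta_Y^2 := \mathbb{E}\left[(\epsilon^Y_{\tau_j})^2\right].$$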



We want to obtain a consistent rate-optimal estimator for the integrated covariance of the efficient
processes $\langle X, Y \rangle_T = \int_0^T \rho_t \sigma_t^X \sigma_t^Y\,dt$ under asymptotics where $\sup_i \Delta t_i \to 0$ and $\sup_j \Delta\tau_j \to 0$.
Assumptions concerning the observation times are imposed later on after constructing a synchronized
joint grid. Evidently, we need a form of regularity criterion to ensure that the numbers of observations
for both processes and the lengths of the (usually not equidistant) time intervals $\Delta t_i = (t_i - t_{i-1})$
and $\Delta\tau_j = (\tau_j - \tau_{j-1})$ are of the same order. Recall that for our analysis we regard the
conditional law given the observation times. A precise analysis for the case of random trading
times (e.g. event times of counting processes) requires some additional concepts that are not the
focus of this article, although we will use Poisson processes to generate the observation times in
Section 7. The latter could be interesting when tick-by-tick data are considered, where trading
times can be modeled as Poisson arrivals. We refer to Zhang (2006b) for an analysis of this special case.
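To make the observation scheme concrete, the following minimal sketch generates asynchronous noisy observations with Poisson arrival times. It is an illustration under assumptions that go beyond the text: unit volatilities, zero drift, constant correlation, Gaussian noise, and both processes observed at time 0. The names `poisson_times` and `simulate_noisy_async` are hypothetical.

```python
import math
import random


def poisson_times(rate, T, rng):
    """Observation times on [0, T]: time 0 plus the arrivals of a Poisson process."""
    times, t = [0.0], 0.0
    while True:
        t += rng.expovariate(rate)  # i.i.d. exponential waiting times
        if t > T:
            return times
        times.append(t)


def simulate_noisy_async(T, rate_x, rate_y, rho, eta_x, eta_y, rng):
    """Return (t, X_obs, tau, Y_obs) for two correlated Brownian motions plus noise."""
    t_x = poisson_times(rate_x, T, rng)
    t_y = poisson_times(rate_y, T, rng)
    merged = sorted(set(t_x) | set(t_y))
    b_x = b_y = 0.0
    path = {merged[0]: (0.0, 0.0)}
    # Simulate both efficient processes on the merged grid of all observation times.
    for a, b in zip(merged, merged[1:]):
        s = math.sqrt(b - a)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b_x += s * z1
        b_y += s * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        path[b] = (b_x, b_y)
    # Each process is observed only at its own times, with additive i.i.d. noise.
    x_obs = [path[t][0] + rng.gauss(0.0, eta_x) for t in t_x]
    y_obs = [path[t][1] + rng.gauss(0.0, eta_y) for t in t_y]
    return t_x, x_obs, t_y, y_obs


t_x, x_obs, t_y, y_obs = simulate_noisy_async(1.0, 200.0, 150.0, 0.5, 0.01, 0.01, random.Random(7))
```

The numbers of observations `len(t_x) - 1` and `len(t_y) - 1` play the roles of $n$ and $m$ and are random here, which is why the analysis is carried out conditionally on the observation times.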



At this point we present an outline and an outlook on the two main results of this article.
The article is organized as follows: in Section 3 we describe the method of synchronization and show
that the Hayashi-Yoshida estimator becomes inconsistent in the presence of market microstructure
noise. In Section 4 we present the subsampling estimator, which is similar to the one proposed by
Palandri (2006) but uses a different representation that will be useful for our further analysis.
In Section 5 we develop our new estimator using a multi-scale approach that improves the rate of
convergence to $N^{1/4}$, resulting in a more efficient estimation method for the quadratic covariation
$\langle X, Y \rangle_T$. This is one of our main results and can be summarized in the following Theorem 1.

Theorem 1. Let Assumptions 1 and 2, and Assumption 3 that will be stated in Section 3, be satisfied.
We will prove that the multi-scale estimator $\widehat{\langle X, Y \rangle}_T^{mult}$, which will be constructed in the following sections,
is unbiased and has the following asymptotic property:

$$\left( \mathbb{E}\left[ \left( \widehat{\langle X, Y \rangle}_T^{mult} - \langle X, Y \rangle_T \right)^2 \right] \right)^{1/2} = \mathcal{O}\left(N^{-1/4}\right).$$

$N$ denotes the number of synchronized observations emanating from the synchronization method that
will be presented in the next section.

In Section 5, after the construction of our multi-scale estimator, Proposition 9 gives a more detailed
version of Theorem 1.

In Section 6 we consider a simplified parametric model with constant parameters and equidistantly
and synchronously observed data contaminated with market microstructure noise. The key result is
the following Theorem 2.

Theorem 2. For a constant correlation coefficient $\rho$ and synchronously, equidistantly observed Brownian
motions with i.i.d. Gaussian noise, local asymptotic normality (LAN) holds with $N^{-1/4}$-rate. We
conclude that $N^{1/4}$ is a lower bound for the rate of convergence for any sequence of estimators and
hence our multi-scale estimator is rate-optimal and asymptotically efficient.



In Proposition 11 we also give bounds for the asymptotic Fisher information.
Simulation results follow in Section 7, where we compare the two proposed estimators and see that
the multi-scale estimator also performs better in the case of finite sample sizes. We observe that if
the influence of market microstructure effects is strong, the multi-scale approach provides a much
more efficient estimation compared to the estimator based on subsampling. For the simulations we
generated the observation times as arrivals of Poisson processes.



3 Dealing with asynchronicity: Synchronizing data and covariance estimation for observations without noise



Hayashi and Yoshida (2005) proved the consistency of their estimator

$$\widehat{\langle X, Y \rangle}_T^{(HY)} = \sum_{i=1}^{n} \sum_{j=1}^{m} \Delta X_{t_i}\, \Delta Y_{\tau_j}\, \mathbb{1}_{[\min(t_i, \tau_j) > \max(t_{i-1}, \tau_{j-1})]},$$

where the product terms include all increments over overlapping time intervals, for the case where
we observe the efficient price processes without noise.
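The double sum with the overlap indicator translates directly into code. The following is a minimal sketch (the function name `hayashi_yoshida` is hypothetical) that favours readability over speed: it checks every pair of increments, which costs $\mathcal{O}(nm)$ operations.

```python
def hayashi_yoshida(t, x, tau, y):
    """Hayashi-Yoshida estimator from observation times t, tau and values x, y.

    Sums the products of all increment pairs whose time intervals
    (t[i-1], t[i]] and (tau[j-1], tau[j]] overlap.
    """
    total = 0.0
    for i in range(1, len(t)):
        dx = x[i] - x[i - 1]
        for j in range(1, len(tau)):
            # Overlap indicator: min(t_i, tau_j) > max(t_{i-1}, tau_{j-1}).
            if min(t[i], tau[j]) > max(t[i - 1], tau[j - 1]):
                total += dx * (y[j] - y[j - 1])
    return total
```

For synchronous times only the pairs with $i = j$ overlap, so the function reduces to the ordinary realized covariance in that case.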

In the case of high-frequency data contaminated by market microstructure noise, however, the
estimator becomes inconsistent and explodes (tends to infinity) for $\min(n, m) \to \infty$. We will prove
this in Proposition 1, but we focus first on an alternative useful method to deal with the asynchronicity
of the data. This method was presented in Palandri (2006), who calls it pseudo-aggregation.



For this purpose we construct sets, each containing one or more observation times, which together
form what we call the joint grid. These sets will be denoted by $\mathcal{H}^i$ and $\mathcal{G}^i$, $i \in \{0, \ldots, N\}$, and $(N + 1)$
is the number of sets contained in the joint grid. The resulting synchronous realized covariance
estimator for the integrated covariance will coincide with the one of Hayashi and Yoshida (2005), but
this approach will be useful for our analysis of noise terms. In particular, this construction will enable
us to deal with the noise contamination by applying subsampling techniques in Section 4. Based on
this construction we get a joint grid with $(N + 1)$ sets $\mathcal{H}^i$ and $\mathcal{G}^i$ where $N \le \min(n, m)$. The last
fact indicates heuristically that the efficiency of such techniques of covariance estimation depends on
the number of observations available for the less frequently observed process. In the ideal case both observation
frequencies do not differ too much, i.e. $n$ and $m$ are of the same order and both assets show similar
liquidity. In Assumption 3 we will give a more precise statement on the conditions imposed on the
observation times needed for our analysis.

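The method of constructing a joint grid for the observations of both processes is described by
the following iterative algorithm:

first step:

• for $t_0 < \tau_0$ and $w_0 := \min\,(w \in \{1, \ldots, n\} \mid t_{w-1} < \tau_0 \le t_w)$:

$$\mathcal{H}^0 = \{t_0, \ldots, t_{w_0}\} \quad \text{and} \quad \mathcal{G}^0 = \{\tau_0\},$$
$$q_1 := \begin{cases} w_0 + 1 & \text{if } \tau_0 = t_{w_0} \\ w_0 & \text{if } \tau_0 < t_{w_0} \end{cases} \quad \text{and} \quad r_1 := 1$$

• for $t_0 = \tau_0$:

$$\mathcal{H}^0 = \{t_0\} \quad \text{and} \quad \mathcal{G}^0 = \{\tau_0\},$$
$$q_1 := 1 \quad \text{and} \quad r_1 := 1$$

• for $t_0 > \tau_0$ and $l_0 := \min\,(l \in \{1, \ldots, m\} \mid \tau_{l-1} < t_0 \le \tau_l)$:

$$\mathcal{H}^0 = \{t_0\} \quad \text{and} \quad \mathcal{G}^0 = \{\tau_0, \ldots, \tau_{l_0}\},$$
$$q_1 := 1 \quad \text{and} \quad r_1 := \begin{cases} l_0 + 1 & \text{if } t_0 = \tau_{l_0} \\ l_0 & \text{if } t_0 < \tau_{l_0} \end{cases}$$

i-th step (given $\mathcal{H}^{i-1}$ and $\mathcal{G}^{i-1}$):

• for $t_{q_i} < \tau_{r_i}$ and $w_i := \min\,(w \in \{q_i + 1, \ldots, n\} \mid t_{w-1} < \tau_{r_i} \le t_w)$:

$$\mathcal{H}^i = \{t_{q_i}, \ldots, t_{w_i}\} \quad \text{and} \quad \mathcal{G}^i = \{\tau_{r_i}\},$$
$$q_i \longrightarrow \begin{cases} q_{i+1} = w_i + 1 & \text{if } \tau_{r_i} = t_{w_i} \\ q_{i+1} = w_i & \text{if } \tau_{r_i} < t_{w_i} \end{cases} \quad \text{and} \quad r_i \longrightarrow r_{i+1} = r_i + 1$$

• for $t_{q_i} = \tau_{r_i}$:

$$\mathcal{H}^i = \{t_{q_i}\} \quad \text{and} \quad \mathcal{G}^i = \{\tau_{r_i}\},$$
$$q_i \longrightarrow q_{i+1} = q_i + 1 \quad \text{and} \quad r_i \longrightarrow r_{i+1} = r_i + 1$$

• for $t_{q_i} > \tau_{r_i}$ and $l_i := \min\,(l \in \{r_i + 1, \ldots, m\} \mid \tau_{l-1} < t_{q_i} \le \tau_l)$:

$$\mathcal{H}^i = \{t_{q_i}\} \quad \text{and} \quad \mathcal{G}^i = \{\tau_{r_i}, \ldots, \tau_{l_i}\},$$
$$q_i \longrightarrow q_{i+1} = q_i + 1 \quad \text{and} \quad r_i \longrightarrow \begin{cases} r_{i+1} = l_i + 1 & \text{if } t_{q_i} = \tau_{l_i} \\ r_{i+1} = l_i & \text{if } t_{q_i} < \tau_{l_i} \end{cases}$$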
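The iteration can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation. The name `joint_grid` and the representation of each set by a list of observation indices are assumptions made here. The first step is treated as the general step started at $q_0 = r_0 = 0$, and when one process has no later observation the search index is clipped to its last observation, a boundary convention that the text does not spell out.

```python
from bisect import bisect_left


def joint_grid(t, tau):
    """Return the joint grid as a list of pairs (H_i, G_i) of index lists.

    t and tau are strictly increasing lists of observation times;
    H_i holds indices into t, G_i holds indices into tau.
    """
    n, m = len(t) - 1, len(tau) - 1
    q, r = 0, 0
    grid = []
    while q <= n and r <= m:
        if t[q] < tau[r]:
            # Smallest w > q with t[w-1] < tau[r] <= t[w] (clipped to n at the boundary).
            w = min(bisect_left(t, tau[r], q + 1), n)
            grid.append((list(range(q, w + 1)), [r]))
            q = w + 1 if t[w] <= tau[r] else w
            r += 1
        elif t[q] > tau[r]:
            # Smallest l > r with tau[l-1] < t[q] <= tau[l] (clipped to m at the boundary).
            l = min(bisect_left(tau, t[q], r + 1), m)
            grid.append(([q], list(range(r, l + 1))))
            r = l + 1 if tau[l] <= t[q] else l
            q += 1
        else:
            grid.append(([q], [r]))
            q += 1
            r += 1
    return grid


# Hypothetical times with n = 5 and m = 6 observations after time 0.
t_example = [0.0, 1.0, 3.0, 4.0, 5.0, 6.0]
tau_example = [0.0, 2.0, 2.5, 2.8, 3.5, 5.0, 6.0]
grid_example = joint_grid(t_example, tau_example)
```

For these hypothetical times the result contains six pairs, so $N = 5 = \min(n, m)$, and consecutive sets share observation indices whenever one observation time closes one set and opens the next.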



Let us give an example for the construction of the joint grid. Of course the example is just for
illustration and the number of observations, $m = 6$ and $N = n = 5$, is restricted and much smaller
than in practice. The example emphasizes some important issues appearing when observations are
asynchronous.
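Example [Figure: observation times of the two processes with $n = 5$ and $m = 6$]

In the example illustrated above we have $\mathcal{H}^0 = \{t_0\}$, $\mathcal{G}^0 = \{\tau_0\}$; $\mathcal{H}^1 = \{t_1, t_2\}$, $\mathcal{G}^1 = \{\tau_1\}$; $\mathcal{H}^2 = \{t_2\}$, $\mathcal{G}^2 = \{\tau_2, \tau_3, \tau_4\}$; $\mathcal{H}^3 = \{t_3\}$, $\mathcal{G}^3 = \{\tau_4, \tau_5\}$; $\mathcal{H}^4 = \{t_4\}$, $\mathcal{G}^4 = \{\tau_5\}$; $\mathcal{H}^5 = \{t_5\}$, $\mathcal{G}^5 = \{\tau_6\}$.

The example shows the important fact that the sets $\mathcal{H}^i$ and $\mathcal{G}^i$ are in general not disjoint and the
maxima of consecutive sets can be the same time points. The minimum of a successive set can as well
equal the maximum of the preceding one. For further examples see Palandri (2006).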

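Next we pass over from the original observations to the sums of observed increments of the noisy
log-prices over sets $\mathcal{H}^i$ and $\mathcal{G}^i$, respectively:

$$\tilde{X}^{\mathcal{H}^i} := \sum_{t_k \in \mathcal{H}^i} \Delta\tilde{X}_{t_k}, \qquad \tilde{Y}^{\mathcal{G}^i} := \sum_{\tau_k \in \mathcal{G}^i} \Delta\tilde{Y}_{\tau_k}, \qquad i \in \{0, \ldots, N\}.$$

We observe that the realized covariance of the synchronized observations

$$SRC := \sum_{i=0}^{N} \tilde{X}^{\mathcal{H}^i}\, \tilde{Y}^{\mathcal{G}^i} = \sum_{i=1}^{n} \sum_{j=1}^{m} \Delta\tilde{X}_{t_i}\, \Delta\tilde{Y}_{\tau_j}\, \mathbb{1}_{[\min(t_i, \tau_j) > \max(t_{i-1}, \tau_{j-1})]}$$

is the well-known Hayashi-Yoshida estimator. We use a different representation of this estimator
compared to Palandri (2006), using telescoping sums. If we define

$$\mu_i := \max\,(k \mid t_k \in \mathcal{H}^i), \quad \tilde{\mu}_i := \max\,(k \mid \tau_k \in \mathcal{G}^i) \quad \text{and}$$
$$\nu_i := \min\,(k \mid t_k \in \mathcal{H}^i), \quad \tilde{\nu}_i := \min\,(k \mid \tau_k \in \mathcal{G}^i), \quad i \in \{0, \ldots, N\},$$

and for the purpose of a simpler notation

$$\tilde{X}_{g_i} := \tilde{X}_{t_{\mu_i}}, \quad \tilde{Y}_{\gamma_i} := \tilde{Y}_{\tau_{\tilde{\mu}_i}}, \quad i \in \{0, \ldots, N\}, \quad \text{and}$$
$$\tilde{X}_{l_i} := \tilde{X}_{t_{\nu_i - 1}}, \quad \tilde{Y}_{\lambda_i} := \tilde{Y}_{\tau_{\tilde{\nu}_i - 1}}, \quad i \in \{1, \ldots, N\},$$

with $l_0 := t_0$, $\lambda_0 := \tau_0$, we can write $\tilde{X}^{\mathcal{H}^i}$ and $\tilde{Y}^{\mathcal{G}^i}$ as telescoping sums $\tilde{X}^{\mathcal{H}^i} = (\tilde{X}_{g_i} - \tilde{X}_{l_i})$,
$\tilde{Y}^{\mathcal{G}^i} = (\tilde{Y}_{\gamma_i} - \tilde{Y}_{\lambda_i})$, which leads to

$$\widehat{\langle X, Y \rangle}_T^{(HY)} = \sum_{i=0}^{N} \left( \tilde{X}_{g_i} - \tilde{X}_{l_i} \right) \left( \tilde{Y}_{\gamma_i} - \tilde{Y}_{\lambda_i} \right). \qquad (1)$$

In this notation $g_i$ denotes the greatest and $l_i$ the last observation time before the least element of the
set $\mathcal{H}^i$, and analogously $\gamma_i$ and $\lambda_i$ of $\mathcal{G}^i$.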
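The telescoping representation can be checked on the sets of the example. The sketch below is an illustration only: the sets are entered by hand as index lists, the observed values are hypothetical integers, and the name `sync_realized_covariance` is an assumption. For each pair of sets it takes the greatest index and the index just before the least index (with index 0 for the first set, matching $l_0 := t_0$ and $\lambda_0 := \tau_0$).

```python
# Joint grid of the example as index lists: (indices of t in H^i, indices of tau in G^i).
EXAMPLE_GRID = [([0], [0]), ([1, 2], [1]), ([2], [2, 3, 4]),
                ([3], [4, 5]), ([4], [5]), ([5], [6])]


def sync_realized_covariance(grid, x, y):
    """Synchronized realized covariance via the telescoping representation."""
    total = 0
    for h_set, g_set in grid:
        g, l = max(h_set), max(min(h_set) - 1, 0)      # indices of g_i and l_i
        gam, lam = max(g_set), max(min(g_set) - 1, 0)  # indices of gamma_i and lambda_i
        total += (x[g] - x[l]) * (y[gam] - y[lam])
    return total


def sum_of_increment_products(grid, x, y):
    """Same quantity written as products of sums of increments over the sets."""
    total = 0
    for h_set, g_set in grid:
        dx = sum(x[k] - x[k - 1] for k in h_set if k >= 1)
        dy = sum(y[k] - y[k - 1] for k in g_set if k >= 1)
        total += dx * dy
    return total


x_vals = [0, 1, 3, 2, 5, 4]     # hypothetical observed values at t_0, ..., t_5
y_vals = [0, 2, 1, 1, 4, 3, 6]  # hypothetical observed values at tau_0, ..., tau_6
value = sync_realized_covariance(EXAMPLE_GRID, x_vals, y_vals)
```

Both functions return the same number, which is the point of the telescoping argument: only the endpoints $g_i, l_i, \gamma_i, \lambda_i$ of each set enter the estimator.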

We are interested in asymptotics where $n, m \to \infty$ are of the same order and the time lags between
returns tend to zero, and hence impose the following assumption on the observation design:

Assumption 3. For the time lags between the observations

$$\sup_{i \in \{1, \ldots, n\}} (t_i - t_{i-1}) = \mathcal{O}\left(\frac{1}{N}\right), \qquad \sup_{j \in \{1, \ldots, m\}} (\tau_j - \tau_{j-1}) = \mathcal{O}\left(\frac{1}{N}\right)$$

holds.

By this assumption we exclude data where the lengths of the observed time intervals vary strongly
or the numbers of observations $n$ and $m$ are of different order. The assumption seems not too
restrictive for most applications. Similar conditions are often imposed for asymptotic analysis, see
e.g. Zhang (2006b). Further results are deduced under asymptotics for $n, m \to \infty$ and Assumption 3.
In the following we will use the notation $\mathbb{E}_{X,Y}[\,\cdot\,]$ for the conditional expectation
given the paths of both efficient processes.



Proposition 1. Under Assumptions 1, 2 and 3, for the synchronized realized covariance estimator



E 



X,Y 



{X,Y)^ 



{X,Y)t , Var 



(HY) 
T 



holds. 

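Proof. Unbiasedness follows directly from Assumption 2 and the conditional variance can be
simplified to:

$$\begin{aligned}
\operatorname{Var}_{X,Y}\left( \widehat{\langle X, Y \rangle}_T^{(HY)} \right)
&= \mathbb{E}\left[ \left( \sum_{i=0}^{N} \left( \epsilon^X_{g_i} - \epsilon^X_{l_i} \right) \left( \epsilon^Y_{\gamma_i} - \epsilon^Y_{\lambda_i} \right) \right)^2 \right] + \mathcal{O}_p(1) \\
&= \mathbb{E}\left[ \sum_{i=0}^{N} \left( \epsilon^X_{g_i} - \epsilon^X_{l_i} \right)^2 \left( \epsilon^Y_{\gamma_i} - \epsilon^Y_{\lambda_i} \right)^2 \right] + \mathcal{O}_p(1)
= 4N\eta_X^2\eta_Y^2 + \mathcal{O}_p(1) = \mathcal{O}_p(N).
\end{aligned}$$

The variances of $(\epsilon^X_{g_i} - \epsilon^X_{l_i})(Y_{\gamma_i} - Y_{\lambda_i})$ and the second sum including increments of $X$ and $\epsilon^Y$
lead to the term of order 1 in probability. The mixed terms in the remaining second moment have an
expectation equal to zero although consecutive sets $\mathcal{H}^i$ and $\mathcal{H}^{i+1}$ (or $\mathcal{G}^i$ and $\mathcal{G}^{i+1}$) are not generally
disjoint. Nevertheless, our synchronization method was defined such that if the intersection of $\mathcal{H}^i$ and
$\mathcal{H}^{i+1}$ is non-empty, $\mathcal{G}^i \cap \mathcal{G}^{i+1} = \emptyset$ holds. Assumption 2 yields that each summand has expectation
$2\eta_X^2 \cdot 2\eta_Y^2$. □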

We conclude that in the presence of market microstructure frictions the Hayashi-Yoshida estimator 
cannot yield a consistent estimation of the integrated covariance of the underlying efficient processes. 
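The growth of the noise-induced variance with the number of observations can be seen in a small Monte Carlo experiment. The sketch below is an illustration under simplifying assumptions that are not taken from the text: fully synchronous observations, Gaussian noise with unit variance, and no efficient price component at all. In this synchronous special case consecutive increments share noise terms, so the constant in front of $N$ need not match the one in the proof; only the linear growth in $N$ is illustrated. The function names are hypothetical.

```python
import random


def noise_only_rc(N, eta_x, eta_y, rng):
    """Realized covariance of pure noise: sum of products of noise increments."""
    ex = [rng.gauss(0.0, eta_x) for _ in range(N + 1)]
    ey = [rng.gauss(0.0, eta_y) for _ in range(N + 1)]
    return sum((ex[i] - ex[i - 1]) * (ey[i] - ey[i - 1]) for i in range(1, N + 1))


def mc_variance(N, reps, rng, eta_x=1.0, eta_y=1.0):
    """Sample variance of the noise-only realized covariance over reps replications."""
    draws = [noise_only_rc(N, eta_x, eta_y, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)


rng = random.Random(0)
v100 = mc_variance(100, 2000, rng)
v400 = mc_variance(400, 2000, rng)
ratio = v400 / v100  # roughly 4 if the variance grows linearly in N
```

Quadrupling $N$ roughly quadruples the variance, whereas the target $\langle X, Y \rangle_T$ stays fixed, which is the mechanism behind the inconsistency.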
4 Dealing with noise contamination: the subsample estimator



In this section we show that a subsample approach as presented in Zhang et al. (2005) leads to a
consistent integrated covariance estimator with $N^{1/6}$-rate of convergence. The sets $\mathcal{H}^i$ and $\mathcal{G}^i$ are
grouped in $K_N$ subsamples and for each subsample the (lower-frequency) realized covariance is
calculated. So the synchronized observations are arranged into subsets of observations, and for each
subset we can calculate a realized covariance for which the error due to noise is smaller than for the
highest-frequency realized covariance because of Proposition 1. Averaging the realized covariances
calculated with lower frequencies leads to the resulting estimator:

$$\widehat{\langle X, Y \rangle}_T^{sub} = \frac{1}{K_N} \sum_{i=K_N}^{N} \left( \tilde{X}_{g_i} - \tilde{X}_{l_{i-K_N}} \right) \left( \tilde{Y}_{\gamma_i} - \tilde{Y}_{\lambda_{i-K_N}} \right).$$
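The averaging over lower-frequency realized covariances can be sketched for the simplest setting. The code below is an illustration under an assumption that deviates from the general notation: observations are taken to be synchronous on one common grid, and both endpoints of each long increment are grid points $K$ steps apart, in the style of the subsampling of Zhang et al. (2005). It does not use the joint-grid indices $g_i, l_i, \gamma_i, \lambda_i$. The name `subsample_covariance` is hypothetical.

```python
def subsample_covariance(x, y, K):
    """Average of K lower-frequency realized covariances on a common grid.

    x and y are lists of (possibly noisy) observations at the same N + 1 times.
    """
    N = len(x) - 1
    if len(y) != len(x) or not 1 <= K <= N:
        raise ValueError("need equally long inputs and 1 <= K <= N")
    # Each summand is a product of increments spanning K grid steps.
    total = sum((x[i] - x[i - K]) * (y[i] - y[i - K]) for i in range(K, N + 1))
    return total / K
```

For `K = 1` the function returns the ordinary realized covariance. Larger `K` lengthens the increments, so the noise in the two endpoints matters less relative to the efficient price increment, at the cost of a coarser discretization.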

Recall that we use the synchronized data and the joint grid that we presented in Section 3 to calculate
the subsampling estimator. $K$ depends on $N$, but we drop the index in the following. The sum over
$K$ subsamples and the sum over all increments in each subsample can be simplified to the form of
the estimator stated above if we assume a regular allocation to subsamples ($\mathcal{H}^0, \mathcal{H}^K, \ldots$ in the first
subsample, $\mathcal{H}^1, \mathcal{H}^{K+1}, \ldots$ in the second, etc.). We do not consider boundary effects (we focus on
asymptotics $K = o(N)$ and $N, n \to \infty$) and hence leave out the weights imposed by Palandri
(2006). His (realized) covariance estimator is defined as an average of weighted sums of products
over overlapping increments also divided in subsamples. If we regard the unions

$$\mathcal{H}_{v,w} = \bigcup_{i=(w-1)K+v}^{wK+(v-1)} \mathcal{H}^i \quad \text{and} \quad \mathcal{G}_{v,w} = \bigcup_{i=(w-1)K+v}^{wK+(v-1)} \mathcal{G}^i$$

of $K$ sets $\mathcal{H}^i$ and $\mathcal{G}^i$, respectively, and $\tilde{X}_{v,w} = \sum_{t_i \in \mathcal{H}_{v,w}} \Delta\tilde{X}_{t_i}$ and $\tilde{Y}_{v,w} = \sum_{\tau_j \in \mathcal{G}_{v,w}} \Delta\tilde{Y}_{\tau_j}$, respectively,
his 'Consistent Realized Covariance' estimator

$$CRC = \frac{1}{K} \sum_{v=0}^{K-1} \sum_{w=1}^{N/K-1} \tilde{X}_{v,w}\, \tilde{Y}_{v,w}$$

corresponds to our proposed estimator (except for the weights). Assuming that the noise processes
across assets are independent, this estimator is unbiased and, as we will prove in the following, has for
the optimal choice $K = \mathcal{O}(N^{2/3})$ an asymptotic variance of order $N^{-1/3}$. The unbiasedness is the reason
why we do not need a bias-correction term in contrast to the realized variance case (two time-scales
estimator presented by Zhang et al. (2005)).



Palandri (2006) has proved this result for his similar estimator, but it is reasonable for our further
analysis concerning the multi-scale estimator in Section 5 to give a short calculation of the asymptotic
variance of the subsample estimator in our representation. We are only interested in the order of the
asymptotic variance and we impose mild assumptions on the grid and time intervals that are inherent
in the method of subsampling and the data. In particular, we assume a regular allocation to subsamples
as stated above and Assumption 3, which ensures (together with Assumption 1) $\Delta X_{t_i} = \mathcal{O}_p(\sqrt{1/N})$ and
$\Delta Y_{\tau_j} = \mathcal{O}_p(\sqrt{1/N})$.
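The total variance can be written as

$$\operatorname{Var}\left( \widehat{\langle X, Y \rangle}_T^{sub} \right) = \frac{1}{K^2} \sum_{i=K}^{N} \sum_{j=K}^{N} \Big( \operatorname{Cov}\big(\mathrm{(e)}_i, \mathrm{(e)}_j\big) + \operatorname{Cov}\big(\mathrm{(m)}_i, \mathrm{(m)}_j\big) + \operatorname{Cov}\big(\mathrm{(\tilde{m})}_i, \mathrm{(\tilde{m})}_j\big) + \operatorname{Cov}\big(\mathrm{(n)}_i, \mathrm{(n)}_j\big) \Big) \qquad (2)$$

with the four uncorrelated terms:

$$\begin{aligned}
\mathrm{(e)}_i &= \left( X_{g_i} - X_{l_{i-K}} \right) \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right), &
\mathrm{(m)}_i &= \left( X_{g_i} - X_{l_{i-K}} \right) \left( \epsilon^Y_{\gamma_i} - \epsilon^Y_{\lambda_{i-K}} \right), \\
\mathrm{(\tilde{m})}_i &= \left( \epsilon^X_{g_i} - \epsilon^X_{l_{i-K}} \right) \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right), &
\mathrm{(n)}_i &= \left( \epsilon^X_{g_i} - \epsilon^X_{l_{i-K}} \right) \left( \epsilon^Y_{\gamma_i} - \epsilon^Y_{\lambda_{i-K}} \right).
\end{aligned}$$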

We will consider the four summands consecutively.

We start our analysis of the asymptotic orders of the different summands in the total variance focusing
on the sum of covariances $\operatorname{Cov}\big(\mathrm{(e)}_i, \mathrm{(e)}_j\big)$. This is the variance due to discretization and would be
the total variance of a subsampling estimator calculated with observations of the efficient processes
without noise. First we deduce the order of increments for the efficient processes from Assumptions
1 and 3.

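Lemma 2. If Assumptions 1 and 3 hold, we obtain the following asymptotic orders for the efficient
processes without microstructure noise:

$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right)^2 \right] = \mathcal{O}(K/N), \qquad (3a)$$
$$\mathbb{E}\left[ \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right)^2 \right] = \mathcal{O}(K/N), \qquad (3b)$$
$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right) \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right) \right] = \mathcal{O}(K/N), \qquad (3c)$$
$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right)^2 \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right)^2 \right] = \mathcal{O}\left(K^2/N^2\right). \qquad (3d)$$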



Proof. By Assumption 1 the drifts of the efficient processes are bounded and thus

$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right)^2 \right] = \mathbb{E}\left[ \left( \int_{l_{i-K}}^{g_i} \sigma_t^X\,dB_t^X \right)^2 \right] + \mathcal{O}\left( (K/N)^{3/2} \right)$$

holds because the squared drift term is of order $K^2/N^2$ and the mixed term is of order $(K/N)^{3/2}$.
Therefore, to prove (3a) it suffices to apply Itô isometry and the mean value theorem using again
Assumption 1 for the spot volatilities:

2" 

E 



E 



h- 



at dBi 



X 



h- 
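The asymptotic orders of the time increments are given by Assumption 3. The constant
occurring by application of the mean value theorem is finite because of Assumption 1. The proof of (3b)
follows analogously.

Using (3a) and (3b) we obtain (3c) by the Cauchy-Schwarz inequality:

$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right) \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right) \right] \le \sqrt{\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right)^2 \right]} \cdot \sqrt{\mathbb{E}\left[ \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right)^2 \right]}.$$

The fourth moments of the increments can be bounded by the squared quadratic variation using the
Burkholder-Davis-Gundy inequality, so it is adequate to prove (3d) again using the Cauchy-Schwarz
inequality:

$$\mathbb{E}\left[ \left( X_{g_i} - X_{l_{i-K}} \right)^2 \left( Y_{\gamma_i} - Y_{\lambda_{i-K}} \right)^2 \right] \le \sqrt{c\, \mathbb{E}\left[ \left( \langle X, X \rangle_{g_i} - \langle X, X \rangle_{l_{i-K}} \right)^2 \right]} \cdot \sqrt{c^*\, \mathbb{E}\left[ \left( \langle Y, Y \rangle_{\gamma_i} - \langle Y, Y \rangle_{\lambda_{i-K}} \right)^2 \right]}$$

with constants $c$ and $c^*$ from the application of the Burkholder-Davis-Gundy inequality. The order
$K^2/N^2$ then easily follows, e.g. using again the mean value theorem as above. □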



Corollary 3. We obtain for the variance due to discretization, when there is no micro structure noise 
present, that we denote by Yarr]x=VY=o 

/ sub\ 

Y^^^=^,.=oi{X,Y)^ 



Proof. 



sub\ 1 ^ ^ 

(^'^)t ) = ]^ E E (©, , ©,) 



i=K j=K 



N i-l 

EE' 

i=K j=K 



N i-l 



v X„.-X, 



h-K I [ Y-yt Y\._j^ 



Xn-X, 



1 ^ 



i=K 



]^ E E l{l-.l<A'}Cov( [X^-X^^+X^-Xi^_^) (y,-%+Y,-Yx^^,, 

i=Kj=K 

X9-Xi^,,,+Xk-K-Xi,.^) (y,-Yx,^^+Yx,,k-Yx,-j,) ) + ^ E ^^'"(©^ 

i=K 

N i-l 1 ^ 

E E l{l-.l<A}Var {{X,-X,^_,) {%-Yx^_,)) + ^ E Var(©J 



2 




i=K j=K 



i=K 
In this calculation we also used characteristics of our synchronization method. The increments
$\left( X_{g_i} - X_{l_{i-K}} \right)$ and $\left( X_{g_j} - X_{l_{j-K}} \right)$ are non-overlapping and hence uncorrelated for $|i - j| > K$.
Taking the construction procedure of the joint grid into account, increments $\left( X_{g_i} - X_{l_{i-K}} \right)$ and
$\left( Y_{\gamma_j} - Y_{\lambda_{j-K}} \right)$ are uncorrelated as well for $|i - j| > K$. □



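So, the asymptotic order of the discretization variance is

$$\frac{1}{K^2} \sum_{i=K}^{N} \sum_{j=K}^{N} \operatorname{Cov}\big(\mathrm{(e)}_i, \mathrm{(e)}_j\big) = \mathcal{O}\left(\frac{K}{N}\right). \qquad (4)$$

For the mixed summands in (2) we obtain under Assumptions 1, 2 and 3

$$\frac{1}{K^2} \sum_{i=K}^{N} \sum_{j=K}^{N} \operatorname{Cov}\big(\mathrm{(m)}_i, \mathrm{(m)}_j\big) = \frac{1}{K^2} \sum_{i=K}^{N} \operatorname{Var}\big(\mathrm{(m)}_i\big) + \frac{2}{K^2} \sum_{i=K}^{N-1} \operatorname{Cov}\big(\mathrm{(m)}_i, \mathrm{(m)}_{i+1}\big) \le C_m \frac{\eta_Y^2}{K} = \mathcal{O}\left(\frac{\eta_Y^2}{K}\right) \qquad (5)$$

with a constant $C_m$ and analogously with a constant $C_{\tilde{m}}$,

$$\frac{1}{K^2} \sum_{i=K}^{N} \sum_{j=K}^{N} \operatorname{Cov}\big(\mathrm{(\tilde{m})}_i, \mathrm{(\tilde{m})}_j\big) \le C_{\tilde{m}} \frac{\eta_X^2}{K} = \mathcal{O}\left(\frac{\eta_X^2}{K}\right). \qquad (6)$$

The fourth (noise) term in (2) is of order $N/K^2$ because

$$\frac{1}{K^2} \sum_{i=K}^{N} \sum_{j=K}^{N} \operatorname{Cov}\big(\mathrm{(n)}_i, \mathrm{(n)}_j\big) = \frac{1}{K^2} \sum_{i=K}^{N} \operatorname{Var}\big(\mathrm{(n)}_i\big) + \frac{2}{K^2} \sum_{i=K}^{N-1} \operatorname{Cov}\big(\mathrm{(n)}_i, \mathrm{(n)}_{i+1}\big) \le C_n \frac{N \eta_X^2 \eta_Y^2}{K^2} = \mathcal{O}\left(\frac{N \eta_X^2 \eta_Y^2}{K^2}\right) \qquad (7)$$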

with a constant $C_n$ holds. We used Assumption 2 that the noise is i.i.d., but recall that $g_i = g_{i+1}$
and $l_i = g_{i-1}$ is possible. The reason why we do not include moments of the distribution of the noise
processes in the constants is that these distributions may depend on the number of observations,
although we dispensed with further indices.

For a choice $K = \mathcal{O}(N^{2/3})$ the first and fourth term are of the same order $N^{-1/3}$ and thus we see that
the $\widehat{\langle X, Y \rangle}_T^{sub}$-estimator is consistent with rate $N^{1/6}$.
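The choice $K = \mathcal{O}(N^{2/3})$ can be made explicit by balancing the two leading variance terms. The following short calculation is a sketch in which $a$ and $b$ are hypothetical positive constants standing for the unspecified constants in the discretization order $K/N$ and the noise order $N/K^2$; the mixed terms of order $1/K$ are ignored because they are of smaller order at the optimum.

```latex
\begin{align*}
v(K) &= a\,\frac{K}{N} + b\,\frac{N}{K^{2}}, \\
v'(K) &= \frac{a}{N} - \frac{2b\,N}{K^{3}} = 0
  \;\Longrightarrow\; K^{*} = \Big(\frac{2b}{a}\Big)^{1/3} N^{2/3}, \\
v(K^{*}) &= \Big( a\,\Big(\frac{2b}{a}\Big)^{1/3} + b\,\Big(\frac{a}{2b}\Big)^{2/3} \Big)\, N^{-1/3}
  = \mathcal{O}\big(N^{-1/3}\big).
\end{align*}
```

Taking the square root of the variance order gives the root mean square error order $N^{-1/6}$, and $1/K^{*} = \mathcal{O}(N^{-2/3})$ confirms that the mixed terms are negligible.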

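We summarize the properties of the subsample estimator in the following proposition.

Proposition 4. If we choose $K_N = \mathcal{O}(N^{2/3})$, the subsample estimator

$$\widehat{\langle X, Y \rangle}_T^{sub} = \frac{1}{K_N} \sum_{i=K_N}^{N} \left( \tilde{X}_{g_i} - \tilde{X}_{l_{i-K_N}} \right) \left( \tilde{Y}_{\gamma_i} - \tilde{Y}_{\lambda_{i-K_N}} \right)$$

is a consistent unbiased estimator with asymptotic variance of order $N^{-1/3}$:

$$\mathbb{E}\left[ \widehat{\langle X, Y \rangle}_T^{sub} - \langle X, Y \rangle_T \right] = 0, \qquad (8a)$$
$$\operatorname{Var}\left( \widehat{\langle X, Y \rangle}_T^{sub} \right) = \mathcal{O}\left(N^{-1/3}\right). \qquad (8b)$$

Bias-variance decomposition yields that

$$\left( \mathbb{E}\left[ \left( \widehat{\langle X, Y \rangle}_T^{sub} - \langle X, Y \rangle_T \right)^2 \right] \right)^{1/2} = \mathcal{O}\left(N^{-1/6}\right). \qquad (8c)$$

In the next section we will show that a multi-scale approach can improve this $N^{1/6}$ rate of
convergence to $N^{1/4}$.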



5 Upgrading the subsample estimator using a multi-scale approach 

In this section we show that using different lower frequencies for subsampling instead of one single
fixed $K_N$ and calculating a weighted mean of the different subsample estimators leads to a more
efficient consistent estimator with a better rate of convergence $N^{1/4}$. We calculate $M_N$ subsample
estimators using the regular sequence $i = 1, 2, 3, \ldots, M_N - 1, M_N$ instead of the fixed $K_N$ in Section
4. We remark that there is no advantage in using a more general sequence of subsample frequencies. We
focus on the variance of our multi-scale estimator for the integrated covariance due to the noise terms
first, which is the conditional variance given the paths of both efficient processes, and calculate
noise-optimal weights that minimize this variance due to market microstructure frictions. In the following
we skip the index $N$ for $M$. The general multi-scale estimator

$$\widehat{\langle X, Y \rangle}_T^{mult} = \sum_{i=1}^{M} \alpha_i\, \widehat{\langle X, Y \rangle}_T^{sub_i} = \sum_{i=1}^{M} \frac{\alpha_i}{i} \sum_{j=i}^{N} \left( \tilde{X}_{g_j} - \tilde{X}_{l_{j-i}} \right) \left( \tilde{Y}_{\gamma_j} - \tilde{Y}_{\lambda_{j-i}} \right)$$

with weights $\alpha_i$ will give a consistent estimator by choosing the weights optimally.
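The weighted combination of subsample estimators can be sketched as follows. This is an illustration under assumptions that go beyond the general definition: observations are synchronous on one common grid, long increments span $i$ grid steps between two grid points, and the weights are the exact noise-optimal weights $\alpha_i = 12i^2/(M^3 - M) - 6i/((M-1)M)$ obtained from the constrained minimization carried out later in this section. The name `multiscale_covariance` is hypothetical.

```python
def multiscale_covariance(x, y, M):
    """Weighted sum of subsample estimators with frequencies i = 1, ..., M.

    x and y are lists of (possibly noisy) observations at the same N + 1 times.
    """
    N = len(x) - 1
    if len(y) != len(x) or not 2 <= M <= N:
        raise ValueError("need equally long inputs and 2 <= M <= N")
    estimate = 0.0
    for i in range(1, M + 1):
        # Noise-optimal weight for subsample frequency i.
        alpha = 12.0 * i * i / (M ** 3 - M) - 6.0 * i / ((M - 1) * M)
        inner = sum((x[j] - x[j - i]) * (y[j] - y[j - i]) for j in range(i, N + 1))
        estimate += alpha / i * inner
    return estimate
```

Some weights are negative, for instance $\alpha_1 = -1$ and $\alpha_2 = 2$ for $M = 2$, which is how the combination cancels the leading noise term while the weights still sum to one.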
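To determine noise-optimal weights, we will impose side conditions on the weights that simplify
the minimization problem for the variance due to noise. After that, we will prove that the variance
due to mixed terms is asymptotically negligible and we will calculate the discretization variance. As
for the subsample estimator there is a trade-off between variance due to noise and variance due to
discretization. Choosing $M$ optimally in the way that the mean square error is minimized, we will see
that the total variance is of order $N^{-1/2}$. An important fact is that the weights as well as the order of
$M$ and the rate of convergence of the multi-scale estimator are in line with the variance case presented
by Zhang (2006a). We impose the condition

$$\sum_{i=1}^{M} \alpha_i = 1, \qquad (9a)$$

that ensures unbiasedness of the resulting estimator, and the auxiliary condition

$$\sum_{i=1}^{M} \frac{\alpha_i}{i} = 0, \qquad (9b)$$

that will guarantee that the 'leading' term in the variance equals zero, on the weights. This gives

$$\begin{aligned}
\sum_{i=1}^{M} \frac{\alpha_i}{i} \sum_{j=i}^{N} \left( \epsilon^X_{g_j} - \epsilon^X_{l_{j-i}} \right) \left( \epsilon^Y_{\gamma_j} - \epsilon^Y_{\lambda_{j-i}} \right)
&= \sum_{i=1}^{M} \frac{\alpha_i}{i} \left( \sum_{j=0}^{N} \epsilon^X_{g_j} \epsilon^Y_{\gamma_j} + \sum_{j=0}^{N} \epsilon^X_{l_j} \epsilon^Y_{\lambda_j} + R_i - \sum_{j=i}^{N} \left( \epsilon^X_{g_j} \epsilon^Y_{\lambda_{j-i}} + \epsilon^X_{l_{j-i}} \epsilon^Y_{\gamma_j} \right) \right) \\
&\overset{(9b)}{=} \sum_{i=1}^{M} \frac{\alpha_i}{i} \left( R_i - \sum_{j=i}^{N} \left( \epsilon^X_{g_j} \epsilon^Y_{\lambda_{j-i}} + \epsilon^X_{l_{j-i}} \epsilon^Y_{\gamma_j} \right) \right)
\end{aligned}$$

with the remainder term

$$R_i = - \sum_{j=0}^{i-1} \epsilon^X_{g_j} \epsilon^Y_{\gamma_j} - \sum_{j=N-i+1}^{N} \epsilon^X_{l_j} \epsilon^Y_{\lambda_j}$$

that is asymptotically negligible in the sum because of Assumption 2 and $i \le M = o(N)$. Hence we
only have to focus on the residual term for the analysis of the variance due to noise. Define

$$U_i := \sum_{j=i}^{N} \left( \epsilon^X_{g_j} \epsilon^Y_{\lambda_{j-i}} + \epsilon^X_{l_{j-i}} \epsilon^Y_{\gamma_j} \right).$$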

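Lemma 5. The variance of $U_i$ satisfies for all $i \in \{1, \ldots, M\}$ the following asymptotic inequality:

$$\operatorname{Var}(U_i) \le 6N\eta_X^2\eta_Y^2. \qquad (10)$$

Proof. The summands have variances $2\eta_X^2\eta_Y^2$. Because of Assumption 2 only covariances

$$\operatorname{Cov}\left( \epsilon^X_{g_j} \epsilon^Y_{\lambda_{j-i}} + \epsilon^X_{l_{j-i}} \epsilon^Y_{\gamma_j},\; \epsilon^X_{g_k} \epsilon^Y_{\lambda_{k-i}} + \epsilon^X_{l_{k-i}} \epsilon^Y_{\gamma_k} \right)$$

with $k = j \pm 1$ can be non-zero. Those covariances are smaller than or equal to $2\eta_X^2\eta_Y^2$ and we obtain
the inequality by separating the variance of $U_i$ into the sum over all variances and covariances. □

Under our assumptions the random variables $U_i$ are not necessarily uncorrelated. For different
subsample frequencies $i$ and $k \in \{i - 1, \ldots, i + 1\}$ the covariances

$$\operatorname{Cov}(U_i, U_k) = \operatorname{Cov}\left( \sum_{j=i}^{N} \left( \epsilon^X_{g_j} \epsilon^Y_{\lambda_{j-i}} + \epsilon^X_{l_{j-i}} \epsilon^Y_{\gamma_j} \right),\; \sum_{r=k}^{N} \left( \epsilon^X_{g_r} \epsilon^Y_{\lambda_{r-k}} + \epsilon^X_{l_{r-k}} \epsilon^Y_{\gamma_r} \right) \right)$$

can be non-zero. In fact, only addends with $r \in \{j - 1, j + 1\}$ and $(j - i) = (r - k)$ could have
non-zero covariance (if the considered maximum and minimum are equal) and hence correlation
effects will be very small, but in any case a mathematical analysis gives an upper bound and the exact
asymptotic order using an inequality similar to those in the last section.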

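Lemma 6. For the variance of the general multi-scale estimator due to noise the asymptotic inequality

$$\operatorname{Var}\left( \sum_{i=1}^{M} \frac{\alpha_i}{i} \sum_{j=i}^{N} \left( \epsilon^X_{g_j} - \epsilon^X_{l_{j-i}} \right) \left( \epsilon^Y_{\gamma_j} - \epsilon^Y_{\lambda_{j-i}} \right) \right) \le \sum_{i=1}^{M} \frac{\alpha_i^2}{i^2}\, 18 N \eta_X^2 \eta_Y^2 \qquad (11)$$

holds.

Proof. Applying the Cauchy-Schwarz inequality to the covariance terms considered above and using
inequality (10) we conclude that

$$\operatorname{Var}\left( \sum_{i=1}^{M} \frac{\alpha_i}{i} \sum_{j=i}^{N} \left( \epsilon^X_{g_j} - \epsilon^X_{l_{j-i}} \right) \left( \epsilon^Y_{\gamma_j} - \epsilon^Y_{\lambda_{j-i}} \right) \right) \le \sum_{i=1}^{M} \frac{\alpha_i^2}{i^2}\, 18 \eta_X^2 \eta_Y^2 \left( N + o(N) \right),$$

which gives the result of Lemma 6. □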
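Minimization with side conditions yields for an arbitrary constant $c \in \mathbb{R}$:

$$\frac{\partial}{\partial \alpha_i} \left( c \sum_{i=1}^{M} \frac{\alpha_i^2}{i^2} + \lambda_1 \left( \sum_{i=1}^{M} \alpha_i - 1 \right) + \lambda_2 \sum_{i=1}^{M} \frac{\alpha_i}{i} \right) = 0
\;\Longrightarrow\; 2c\,\frac{\alpha_i}{i^2} + \lambda_1 + \frac{\lambda_2}{i} = 0
\;\Longrightarrow\; \alpha_i = -\frac{1}{2c} \left( i^2 \lambda_1 + \lambda_2 i \right).$$

Since $1 = \sum_{i=1}^{M} \alpha_i = -\frac{1}{2c} \left( \lambda_1 \sum_{i=1}^{M} i^2 + \lambda_2 \sum_{i=1}^{M} i \right)$ and $0 = \sum_{i=1}^{M} \frac{\alpha_i}{i} = -\frac{1}{2c} \left( \lambda_1 \sum_{i=1}^{M} i + \lambda_2 M \right)$ we get the result

$$\lambda_1 = \frac{-24c}{M^3 - M}, \qquad \lambda_2 = \frac{12c}{(M - 1)M}, \qquad \alpha_i = \frac{12 i^2}{M^3 - M} - \frac{6i}{(M - 1)M}.$$

The noise-optimal weights are the same as for the MSRV-estimator invented by Zhang (2006a) for
efficient high-frequency realized variance estimation from noisy observations. This is a positive aspect
of using these methods because the weights are calculated once and serve for variance as well as
covariance estimation.
Inserting the noise-optimal weights

$$\alpha_{i,opt} = \left( \frac{12 i^2}{M^3} - \frac{6i}{M^2} \right) \left( 1 + o(1) \right) \qquad (12)$$

in the noise-variance term above yields the result stated in the following proposition.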
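The two side conditions and the size of the resulting noise-variance factor can be verified exactly with rational arithmetic. The sketch below uses the exact weights $\alpha_i = 12i^2/(M^3 - M) - 6i/((M-1)M)$ from the minimization above; the function name `noise_optimal_weights` is hypothetical.

```python
from fractions import Fraction


def noise_optimal_weights(M):
    """Exact noise-optimal weights alpha_1, ..., alpha_M as fractions (requires M >= 2)."""
    if M < 2:
        raise ValueError("M must be at least 2")
    return [Fraction(12 * i * i, M ** 3 - M) - Fraction(6 * i, (M - 1) * M)
            for i in range(1, M + 1)]


M = 10
alphas = noise_optimal_weights(M)
sum_alpha = sum(alphas)                                          # condition (9a)
sum_alpha_over_i = sum(a / i for i, a in enumerate(alphas, 1))   # condition (9b)
noise_factor = sum(a * a / (i * i) for i, a in enumerate(alphas, 1))
```

Working through the sums by hand shows that `noise_factor` equals $12/(M^3 - M)$ exactly, which is the source of the $N/M^3$ order of the noise variance once it is multiplied by the bound of Lemma 6.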



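Proposition 7. If we insert the noise-optimal weights $\alpha_{i,opt}$ from (12) in the general multi-scale
estimator and assume that the variances $\eta_X^2$, $\eta_Y^2$ of the noise distributions are of order 1, the asymptotic
variance due to noise satisfies

$$\operatorname{Var}\left( \sum_{i=1}^{M} \frac{\alpha_{i,opt}}{i} \sum_{j=i}^{N} \left( \epsilon^X_{g_j} - \epsilon^X_{l_{j-i}} \right) \left( \epsilon^Y_{\gamma_j} - \epsilon^Y_{\lambda_{j-i}} \right) \right) = \mathcal{O}\left( \frac{N}{M^3} \right). \qquad (13)$$

Proof. We calculate the minimal noise-variance using the optimal weights (12). For the occurring
sums it suffices to use the asymptotic formula $\sum_{i=1}^{M} i^k = \frac{M^{k+1}}{k+1} + \mathcal{O}(M^k)$ and Lemmas 5 and 6 to
deduce the asymptotic order of the variance:

$$\sum_{i=1}^{M} \frac{\alpha_{i,opt}^2}{i^2}\, 18 N \eta_X^2 \eta_Y^2 = 18 N \eta_X^2 \eta_Y^2 \sum_{i=1}^{M} \left( \frac{12 i}{M^3} - \frac{6}{M^2} \right)^2 \left( 1 + o(1) \right) = 18\, \eta_X^2 \eta_Y^2\, \frac{12 N}{M^3} + o\left( \frac{N}{M^3} \right).$$

□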
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We have shown that our resulting multi-scale estimator with noise-optimal weights has a variance 
due to noise contamination of asymptotic order N/M^. Next we focus on the other terms occurring 
in the total variance. 

Under the stated assumptions the multi-scale estimator for integrated covariance is unbiased and the
variance induced by the mixed summands is asymptotically negligible. Unbiasedness holds by
Condition (9a) on the weights and we focus on the variance of the mixed summands now.
The analysis of the terms with the @jS and @jS (see Q for definition) is analogous and we only 
mention the analysis of the first term. Inserting the noise-optimal weights (12), the variance equals

/ M N 

Var E^E(^., -4-.) 



J=l j=i 

M 



/144i/c 36 72i 72k 
i=i k=i ^ 



N N 
j=i r=k 

The covariances can only be non-zero if the time intervals of the increments of the efficient process
$X$ are overlapping and, because of Assumption 2, if $r \in \{j-1, j, j+1\}$ or $(r-k) = (j-i)$ holds.
Therefore, we conclude that a constant $C^*$ exists such that

i=i k=i ^ ' 
is an upper bound for the variance and we obtain the asymptotic order $1/M$ for the mixed terms. We
have deduced that the mixed terms are negligible in the asymptotic total variance and hence we focus
next on the terms containing the (e)^s (see (|2]l) and the variance due to discretization. 
Thus the variance term of interest is 

M N 



I 

1=1 j=i 




M i N 

2 E E ^c„v( E 

1=1 k=l j=i 

N 



r=k 

Considering next the single covariance terms in the discretization-variance using 



E 



for arbitrary i, j and /, k leads to 



^'^*min(ij) "'^tmax(i-ij-fc) j ^ {iiiin (i j) >max ( j-i J- fc) } 



M N 




H J F F \C* <C** — 



I 1=1 ^ 
with constants $C^*$ and $C^{**}$. The inequality is deduced in the usual way by analyzing which increments
are overlapping and hence correlated and using the asymptotic orders of the increments known by
Assumptions 1 and 3.

Proposition 8. For the variance of the noise-optimal multi-scale estimator due to discretization the 
following asymptotic inequality holds: 



Var 



' M N 



i=l 



X, 



fM 
— 



(14) 



The discretization variance is of order $M/N$ and we have to choose

$$M = O(\sqrt{N})$$

to reduce the total variance to order $1/M$ or rather $1/\sqrt{N}$. There is a trade-off between the variance
terms due to microstructure noise and discretization, and the total variance is minimized by a choice
of $M$ for which both are of the same asymptotic order. Calculating $M = O(\sqrt{N})$ different
subsample estimators and forming their weighted sum with the noise-optimal weights yields
an estimator with asymptotic total variance of order $1/\sqrt{N}$, improving the rate of convergence to
$N^{1/4}$ compared with $N^{1/6}$ for the simple one-scale estimator presented in Section 4. Although the new
estimator requires more computing time, it is more efficient for covariance
estimation in the case of high-frequency noisy observations.
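The trade-off can be illustrated with a simple proxy for the total variance. The sketch below is only an illustration under assumptions: the constants `c_noise` and `c_disc` are hypothetical placeholders (the text does not state them), and the proxy keeps only the two leading orders $N/M^3$ and $M/N$. Setting the derivative in $M$ to zero gives $M^4 = 3\,c_{noise} N^2 / c_{disc}$, so the minimizer is proportional to $\sqrt{N}$.

```python
import math


def total_variance(M, N, c_noise=1.0, c_disc=1.0):
    """Variance proxy: noise part of order N / M^3 plus discretization part of order M / N."""
    return c_noise * N / M**3 + c_disc * M / N


def best_M(N, c_noise=1.0, c_disc=1.0):
    """Minimizer of the proxy over real M > 0, from M^4 = 3 c_noise N^2 / c_disc."""
    return (3.0 * c_noise / c_disc) ** 0.25 * math.sqrt(N)
```

At the minimizer both parts are of order $N^{-1/2}$, so multiplying $N$ by 100 divides the minimal proxy value by 10.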

We present the results derived in this section again in the following proposition which implies
Theorem 1.



Proposition 9. If we choose $M_N = O(\sqrt{N})$ and calculate the noise-optimal multi-scale estimator
for the integrated covariance



mult / 1 97 fi \ 

j=i 



,=1 .Ml, iV4 



(15) 



we obtain a consistent unbiased estimator with asymptotic variance of order $M^{-1} = N^{-1/2}$:



E 



mult 



(16a) 



— mult 

Var ( (X,y)y 







M 



The bias-variance decomposition yields that 
IE 



mult ^ 2 




1/2 



(a-V4^ . 



(16b) 



(16c) 



We will prove the rate of convergence $N^{1/4}$ to be optimal in the following section.

Remark 10. We suppose that an extension of Proposition 9 to non-i.i.d. noise is possible under a
milder assumption of exponentially decreasing mixing coefficients such that the equations (16a)-(16c)
still hold. This extension for the one-dimensional case has been developed in Aït-Sahalia et al. (2005).
6 A lower bound for the rate of convergence



In the following we show the LAN (local asymptotic normality) property for a constant correlation
coefficient $\rho = \mathrm{corr}(B^X, B^Y)$ of the two Brownian motions of $X$ and $Y$ with rate $N^{-1/4}$ within
the following simplified model and conclude the rate-optimality of our estimator defined in the last
section.

We have the observations:

$$\tilde X_{t_i} = \int_0^{t_i} dB_t^X + \epsilon_{t_i}^X, \qquad \tilde Y_{t_i} = \int_0^{t_i} dB_t^Y + \epsilon_{t_i}^Y, \qquad i = 0, \ldots, N.$$

We restrict ourselves to synchronous observations and equidistant time intervals $\Delta t_i = \Delta t = 1/N$.
Furthermore we assume the discrete noise processes to be independent of the efficient processes
and independent of each other (as in Assumption 2 before). We strengthen the i.i.d. assumption for the
noise to an i.i.d. Gaussian assumption:

$$\epsilon_{t_i}^X \sim N(0, \eta_X^2), \qquad \epsilon_{t_i}^Y \sim N(0, \eta_Y^2), \qquad i = 0, \ldots, N.$$



We want to estimate the parameter $\rho$ from observed increments $(\Delta\tilde X_{t_1}, \ldots, \Delta\tilde X_{t_N}, \Delta\tilde Y_{t_1}, \ldots, \Delta\tilde Y_{t_N})$
taking values in a measurable space $(\Omega_{2N}, \mathcal{F}_{2N})$ with law $P_\rho^{2N}$. Local asymptotic normality with rate
$N^{-1/4}$ means that for a real sequence $h_N \to h$ the sequence of log-likelihoods converges in law to a
limit of the following form:

$$\log\frac{dP^{2N}_{\rho + N^{-1/4}h_N}}{dP^{2N}_{\rho}} \;\xrightarrow{\;\mathcal{L}\;}\; h Z\sqrt{I(\rho)} - \frac{h^2 I(\rho)}{2}$$

with $Z \sim N(0, 1)$ and $I(\rho)$ denoting the Fisher information. Then the limit distribution of a sequence
of estimators $\hat\rho_{2N}$ is, under regularity conditions, the convolution of a Gaussian distribution and a noise
factor. The maximum risk of any estimator is bounded below by the Gaussian risk and the minimax
theorem gives the result on how well the parameter can be estimated asymptotically. See e.g. van der
Vaart (1998) and van der Vaart and Wellner (1996) for further information on LAN and optimal
convergence rates.



We summarize the results of this section in the following Proposition 11.



Proposition 11. In the simple model of two synchronously equidistantly observed standard Brownian
motions $X$ and $Y$ with constant correlation $\rho$ and an observation noise described by i.i.d. Gaussian
errors with standard deviations $\eta_X$ and $\eta_Y$ the LAN property with $N^{-1/4}$-rate holds, where $N$ denotes
the number of observations in the interval $[0, 1]$. Assuming without loss of generality $\eta_X > \eta_Y$, we
obtain the following lower and upper bound for the asymptotic Fisher information:

$$\frac{1}{8\eta_X}\left(\frac{1}{(1+\rho)^{3/2}} + \frac{1}{(1-\rho)^{3/2}}\right) \le I(\rho) \le \frac{1}{8\eta_Y}\left(\frac{1}{(1+\rho)^{3/2}} + \frac{1}{(1-\rho)^{3/2}}\right). \qquad (17)$$

Particularly assuming the variance of both noise processes to be equal ($\eta_X = \eta_Y = \eta$) we can
calculate the exact asymptotic Fisher information. It is given by

$$I(\rho) = \frac{1}{8\eta}\left(\frac{1}{(1+\rho)^{3/2}} + \frac{1}{(1-\rho)^{3/2}}\right). \qquad (18)$$
Proposition 11 implies Theorem 2 and gives, furthermore, bounds for the asymptotic Fisher information.



Remark 12. We prove the LAN property with rate $N^{-1/4}$ in this simplified model and thus the
optimality of our multi-scale estimator. It has the optimal rate of convergence even in the synchronous
equidistant case. The asymptotic Fisher information is enclosed between the 'natural' lower and an
intuitive upper bound. We state that the Fisher information (18) has the following asymptotic
behaviour:

$I(\rho) \to \infty$ for $\rho \to \pm 1$ and $I(\rho) \to 0$ for $\eta \to \infty$.
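The limiting behaviour is easy to inspect numerically. The following sketch assumes the closed form $I(\rho) = \frac{1}{8\eta}\left((1+\rho)^{-3/2} + (1-\rho)^{-3/2}\right)$ for equal noise standard deviations $\eta$, and takes the bounds for unequal noise as this expression evaluated at $\eta_X$ and at $\eta_Y$. The function names are hypothetical.

```python
def fisher_information(rho, eta):
    """Asymptotic Fisher information for equal noise standard deviations eta > 0."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if eta <= 0.0:
        raise ValueError("eta must be positive")
    return ((1.0 + rho) ** -1.5 + (1.0 - rho) ** -1.5) / (8.0 * eta)


def fisher_bounds(rho, eta_x, eta_y):
    """(lower, upper) bound for unequal noise levels, assuming eta_x >= eta_y > 0."""
    if eta_x < eta_y:
        raise ValueError("expected eta_x >= eta_y")
    return fisher_information(rho, eta_x), fisher_information(rho, eta_y)
```

The information grows without bound as $|\rho|$ approaches 1 and shrinks like $1/\eta$ as the noise level increases.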

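Proof. First we will prove the LAN property for the simpler case of equal noise variances $\eta_X = \eta_Y = \eta$
and calculate the asymptotic Fisher information (18). We want to derive the distribution of
the increments

$$\Delta\tilde X_{t_i} = \int_{t_{i-1}}^{t_i} dB_t^X + \epsilon_{t_i}^X - \epsilon_{t_{i-1}}^X \quad\text{and}\quad \Delta\tilde Y_{t_i} = \int_{t_{i-1}}^{t_i} dB_t^Y + \epsilon_{t_i}^Y - \epsilon_{t_{i-1}}^Y.$$

The constant correlation parameter is denoted by $\theta$ in the following. There exists a Brownian motion
$B$ independent of $X$ such that the following equation holds:

$$\Delta\tilde Y_{t_i} = \int_{t_{i-1}}^{t_i}\theta\,dB_t^X + \int_{t_{i-1}}^{t_i}\sqrt{1-\theta^2}\,dB_t + \epsilon_{t_i}^Y - \epsilon_{t_{i-1}}^Y.$$

Taking this into account we can easily calculate the covariances of the increments

$$\mathrm{Cov}(\Delta\tilde X_{t_i}, \Delta\tilde X_{t_j}) = \mathrm{Cov}(\Delta\tilde Y_{t_i}, \Delta\tilde Y_{t_j}) = \begin{cases} \Delta t + 2\eta^2 & \text{if } i = j \\ -\eta^2 & \text{if } |i - j| = 1 \\ 0 & \text{if } |i - j| > 1 \end{cases}$$

$$\mathrm{Cov}(\Delta\tilde X_{t_i}, \Delta\tilde Y_{t_j}) = \begin{cases} \theta\Delta t & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases}$$

The random vector $(\Delta\tilde X_{t_1}, \ldots, \Delta\tilde X_{t_N}, \Delta\tilde Y_{t_1}, \ldots, \Delta\tilde Y_{t_N})^t$ has a $2N \times 2N$ dimensional covariance
matrix

$$\Sigma_\theta = \begin{pmatrix} A_N & D_N \\ D_N & A_N \end{pmatrix}$$

with the $N \times N$ diagonal matrix

$$D_N = \begin{pmatrix} \theta\Delta t & & \\ & \ddots & \\ & & \theta\Delta t \end{pmatrix}$$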
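and the $N \times N$ tridiagonal 1-Toeplitz matrix

$$A_N = \begin{pmatrix} \Delta t + 2\eta^2 & -\eta^2 & & \\ -\eta^2 & \ddots & \ddots & \\ & \ddots & \ddots & -\eta^2 \\ & & -\eta^2 & \Delta t + 2\eta^2 \end{pmatrix}.$$

This special structure of the covariance matrix makes it possible to explicitly compute the eigenvalues
of $\Sigma_\theta$. Here the fact that we assumed the variances of both noise processes to be equal plays an
important role.

We write the $N$-dimensional identity matrix as $1_N$. Since $D_N = \theta\Delta t\, 1_N$ commutes with $A_N$, the characteristic polynomial of $\Sigma_\theta$ can be
written as

$$\det(\Sigma_\theta - \lambda 1_{2N}) = \det\left((A_N - \lambda 1_N)^2 - (\theta\Delta t)^2\, 1_N\right).$$

Using a Laplace expansion, the characteristic polynomials of $A_N$ can be computed by a recursion:

$$\det(A_N - \lambda 1_N) = (\Delta t + 2\eta^2 - \lambda)\det(A_{N-1} - \lambda 1_{N-1}) - (\eta^2)^2\det(A_{N-2} - \lambda 1_{N-2})$$
$$= \sum_{k=0}^{\lfloor N/2\rfloor}(-1)^k\binom{N-k}{k}(\Delta t + 2\eta^2 - \lambda)^{N-2k}(\eta^2)^{2k}.$$

The eigenvalues of $A_N$ are $\lambda_{i,N} = \Delta t + 2\eta^2\left(1 - \cos\frac{i\pi}{N+1}\right)$, $i = 1, \ldots, N$, and because of the
simple structure of $D_N$ we can deduce the $2N$ eigenvalues of the covariance matrix directly:

$$\lambda_i^+(\theta) = \Delta t(1+\theta) + 2\eta^2\left(1 - \cos\frac{i\pi}{N+1}\right), \quad i = 1, \ldots, N, \qquad (19a)$$
$$\lambda_i^-(\theta) = \Delta t(1-\theta) + 2\eta^2\left(1 - \cos\frac{i\pi}{N+1}\right), \quad i = 1, \ldots, N. \qquad (19b)$$

With the notation

$$\lambda_{j,2N}(\theta) = \begin{cases} \lambda_i^+ & \text{if } j = 2i-1, \quad i = 1, \ldots, N \\ \lambda_i^- & \text{if } j = 2i, \quad i = 1, \ldots, N \end{cases} \qquad (19c)$$

we can write the $2N \times 2N$ diagonal matrix of the eigenvalues as $\Lambda_\theta^{2N}$ with $(\Lambda_\theta^{2N})_{jj} = \lambda_{j,2N}(\theta)$. $\Sigma_\theta$
can be diagonalized by a $2N \times 2N$ orthogonal matrix $P^{2N}$ which is independent of $\theta$. The random
vector $(P^{2N})^t \cdot (\Delta\tilde X_{t_1}, \ldots, \Delta\tilde X_{t_N}, \Delta\tilde Y_{t_1}, \ldots, \Delta\tilde Y_{t_N})^t$ is centered Gaussian with covariance matrix $\Lambda_\theta^{2N}$.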
We define the 2A/^-dimensional random vector by 



V-^j,2Jv(p) 



(P^A' . ( AXi, , . . . , AXt^ ,AYt„..., AYtJ) 



J< 



Ai,27v(p) 
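The eigenvalue formulas for the equal-variance case can be verified directly for small $N$. The sketch below is an illustration under assumptions, not the authors' computation: it builds the $2N \times 2N$ block matrix with tridiagonal blocks (diagonal $\Delta t + 2\eta^2$, off-diagonal $-\eta^2$) and off-diagonal blocks $\theta\Delta t$ times the identity, and tests the candidate eigenvalues $\Delta t(1 \pm \theta) + 2\eta^2(1 - \cos(i\pi/(N+1)))$ against the vectors $(v_i, \pm v_i)$ with $v_i[k] = \sin(ik\pi/(N+1))$. The choice of these sine vectors is the standard one for tridiagonal Toeplitz matrices and is an assumption here, since the text does not write the eigenvectors out.

```python
import math


def covariance_matrix(N, dt, eta, theta):
    """Block matrix [[A, D], [D, A]]: A tridiagonal (dt + 2 eta^2, -eta^2), D = theta * dt * I."""
    size = 2 * N
    S = [[0.0] * size for _ in range(size)]
    for offset in (0, N):
        for i in range(N):
            S[offset + i][offset + i] = dt + 2.0 * eta**2
            if i + 1 < N:
                S[offset + i][offset + i + 1] = -eta**2
                S[offset + i + 1][offset + i] = -eta**2
    for i in range(N):
        S[i][N + i] = theta * dt
        S[N + i][i] = theta * dt
    return S


def eigenpair(N, dt, eta, theta, i, sign):
    """Candidate eigenvalue dt (1 + sign * theta) + 2 eta^2 (1 - cos(i pi / (N + 1))) and its vector."""
    lam = dt * (1.0 + sign * theta) + 2.0 * eta**2 * (1.0 - math.cos(i * math.pi / (N + 1)))
    v = [math.sin(i * k * math.pi / (N + 1)) for k in range(1, N + 1)]
    return lam, v + [sign * x for x in v]


def residual(S, lam, w):
    """Largest absolute entry of S w - lam w; zero up to rounding for a true eigenpair."""
    return max(
        abs(sum(a * x for a, x in zip(row, w)) - lam * wk)
        for row, wk in zip(S, w)
    )
```

Because the vectors $(v_i, \pm v_i)$ do not depend on $\theta$, the check also illustrates why the diagonalizing orthogonal matrix is independent of the correlation parameter.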
To prove the LAN property we have to examine the log-likelihood

Tl2N 



log 



log 



j)±N_jlh_N 
dP2^ 



2N 



^j,2N{p) 



>^j,2N [P + N ihN 



j=i \ 



2^ 
J ^2N + 1 



where

$$\gamma_j^{2N} = \frac{\lambda_{j,2N}(\rho + N^{-1/4}h_N)}{\lambda_{j,2N}(\rho)} - 1 = \pm\frac{\Delta t \cdot N^{-1/4}h_N}{\lambda_{j,2N}(\rho)}.$$

The proof is now analogous to the one-dimensional case (see Gloter and Jacod (2001)) and using
Theorem VIII-3.32 in Jacod and Shiryaev (2003) it remains to show that

$$\sup_{1 \le j \le 2N}\left|\gamma_j^{2N}\right| \to 0 \quad\text{and}\quad \sum_{j=1}^{2N}\left(\gamma_j^{2N}\right)^2 \to 2h^2 I(\rho). \qquad (20)$$

The first condition is obviously fulfilled. To prove the second one we write the sum of the squares as
a Riemann sum and use an inequality including the corresponding integral:



2N 



N 



j=i i=i ( 1 + p + ^ 



N 



2yf- / -I ITT 

—'— ' 1 — cos 



TV+l 



i=i 1 - p + 



At I ^ COS j^j^^ 



N^^hlf {Aty 



vr 



N 

E 



At(i+p) y 



=5jv 



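For the integral

$$J = \int_0^{\pi}\frac{1}{\left(2(1-\cos z) + \frac{\Delta t(1+\rho)}{\eta^2}\right)^2}\,dz$$

and accordingly

$$\tilde J = \int_0^{\pi}\frac{1}{\left(2(1-\cos z) + \frac{\Delta t(1-\rho)}{\eta^2}\right)^2}\,dz$$

the following inequalities with the lower and upper Darboux sums hold: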



N 



1 



^^f2fl-cos(^U^V 



AT 

< J < > 

- AT 



/=i 2 1 - cos 



TTT -\ 771 
and accordingly

N 



TT 
TV 



E 



(2(l-cosMlt) + MW0y 



N 

<j<^y 



(2(l-cos^) + «f^l)- 



Thus the following inequalities hold for the Riemann sums $S_N$ and $\tilde S_N$, respectively:

^ ^ TT 1 TT 1 

J <Sn <J + 



and 



J < 5jv < J + 



AT 



TT 



N{4+^y ^(2(l-cos^) + ^^±|^)' 



TT 



^iv(4 + MW))^ ^(2(l-cos^) + ^*±M) 



2 ■ 



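The integrals can be computed explicitly:

$$J = \frac{\pi\left(2 + \frac{\Delta t(1+\rho)}{\eta^2}\right)}{\left(\frac{\Delta t(1+\rho)}{\eta^2}\left(4 + \frac{\Delta t(1+\rho)}{\eta^2}\right)\right)^{3/2}}, \qquad \tilde J = \frac{\pi\left(2 + \frac{\Delta t(1-\rho)}{\eta^2}\right)}{\left(\frac{\Delta t(1-\rho)}{\eta^2}\left(4 + \frac{\Delta t(1-\rho)}{\eta^2}\right)\right)^{3/2}}.$$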



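Since $h_N \to h$, we can deduce from the preceding inequalities for both summands the convergence

$$\sum_{j=1}^{2N}\left(\gamma_j^{2N}\right)^2 \to \frac{h^2}{4\eta}\left(\frac{1}{(1+\rho)^{3/2}} + \frac{1}{(1-\rho)^{3/2}}\right) = 2h^2 I(\rho) \qquad (21)$$

with the Fisher information $I(\rho)$ from (18).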
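The convergence of the sum of squares can be checked numerically. This is a sketch under stated assumptions rather than part of the proof: it takes $\Delta t = 1/N$, the equal-variance eigenvalues $\Delta t(1 \pm \rho) + 2\eta^2(1 - \cos(i\pi/(N+1)))$, and $\gamma_j = \pm \Delta t\, N^{-1/4} h / \lambda_j$, and compares the finite-$N$ sum with the limit $\frac{h^2}{4\eta}\left((1+\rho)^{-3/2} + (1-\rho)^{-3/2}\right)$. The function names are hypothetical.

```python
import math


def sum_gamma_squared(N, rho, eta, h):
    """Finite-N value of the sum over all 2N squared coefficients gamma_j^2, with dt = 1/N."""
    dt = 1.0 / N
    scale = (dt * h) ** 2 / math.sqrt(N)  # (dt * N^(-1/4) * h)^2
    total = 0.0
    for i in range(1, N + 1):
        noise_part = 2.0 * eta**2 * (1.0 - math.cos(i * math.pi / (N + 1)))
        total += 1.0 / (dt * (1.0 + rho) + noise_part) ** 2
        total += 1.0 / (dt * (1.0 - rho) + noise_part) ** 2
    return scale * total


def limit_value(rho, eta, h):
    """Claimed limit h^2 / (4 eta) * ((1 + rho)^(-3/2) + (1 - rho)^(-3/2))."""
    return h**2 / (4.0 * eta) * ((1.0 + rho) ** -1.5 + (1.0 - rho) ** -1.5)
```

For example, comparing `sum_gamma_squared(200000, 0.5, 0.1, 1.0)` with `limit_value(0.5, 0.1, 1.0)` shows how close the finite sum already is for a sample size of this magnitude.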



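We continue the proof with the generalization for different noise variances. If the noise variances are
not equal, $\eta_X \ne \eta_Y$, the covariance matrix can be written as

$$\Sigma_\theta = \begin{pmatrix} A_N & D_N \\ D_N & B_N \end{pmatrix}$$

with the same diagonal matrix $D_N$ as before and two tridiagonal 1-Toeplitz matrices $A_N$ and $B_N$ with
the same structure as before, where $A_N$ has the entries $\Delta t + 2\eta_X^2$ on the main diagonal and
correspondingly $B_N$ the entries $\Delta t + 2\eta_Y^2$. The eigenvalues of $A_N$ and $B_N$ have been deduced before and are
denoted by $\lambda_X^{(i)}$ and $\lambda_Y^{(i)}$ here, which emphasizes the dependence on $\eta_X$ and $\eta_Y$, respectively.
Because of the special structure of $A_N$ and $B_N$, which are in particular symmetric and commute, they
share the same eigenvectors $v_i$, $i = 1, \ldots, N$. We can calculate the $2N$ eigenvalues of $\Sigma_\theta$, denoted
by $\lambda_+^{(i)}, \lambda_-^{(i)}$, $i = 1, \ldots, N$, using the approach

$$\Sigma_\theta\begin{pmatrix}\alpha v_i \\ \beta v_i\end{pmatrix} = \begin{pmatrix} A_N & D_N \\ D_N & B_N\end{pmatrix}\begin{pmatrix}\alpha v_i \\ \beta v_i\end{pmatrix} = \lambda\begin{pmatrix}\alpha v_i \\ \beta v_i\end{pmatrix}$$

for the eigenvectors with constants $\alpha$ and $\beta$. This equation implies that

$$\alpha\lambda_X^{(i)} + \beta\Delta t\,\theta = \alpha\lambda, \qquad \alpha\Delta t\,\theta + \beta\lambda_Y^{(i)} = \beta\lambda,$$

and by solving this system of equations we obtain the $2N$ eigenvalues

$$\lambda_\pm^{(i)} = \frac{\lambda_X^{(i)} + \lambda_Y^{(i)}}{2} \pm \sqrt{\left(\frac{\lambda_X^{(i)} - \lambda_Y^{(i)}}{2}\right)^2 + \theta^2(\Delta t)^2}.$$

We have dropped the index $N$ of the eigenvalues here.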

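Lemma 13. If we assume $\eta_X > \eta_Y$, the following inequalities hold:

$$\frac{\lambda_X^{(i)} + \lambda_Y^{(i)}}{2} + \theta\Delta t < \lambda_+^{(i)} < \lambda_X^{(i)} + \theta\Delta t, \qquad (23a)$$
$$\lambda_Y^{(i)} - \theta\Delta t < \lambda_-^{(i)} < \frac{\lambda_X^{(i)} + \lambda_Y^{(i)}}{2} - \theta\Delta t. \qquad (23b)$$

Proof. If $\eta_X > \eta_Y$, then $\lambda_X^{(i)} > \lambda_Y^{(i)}$ holds for the eigenvalues for all $i \in \{1, \ldots, N\}$. Thus

$$\lambda_+^{(i)} < \frac{\lambda_X^{(i)} + \lambda_Y^{(i)}}{2} + \frac{\lambda_X^{(i)} - \lambda_Y^{(i)}}{2} + \theta\Delta t = \lambda_X^{(i)} + \theta\Delta t$$

holds and analogously the lower bound for $\lambda_-^{(i)}$ is obtained by adding the mixed term to the expression
under the square root. The other bounds are obvious. □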
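The two eigenvalue branches and the bounds of the lemma can be sanity-checked for concrete numbers. The sketch below assumes the formula $\lambda_\pm = \frac{\lambda_X + \lambda_Y}{2} \pm \sqrt{\left(\frac{\lambda_X - \lambda_Y}{2}\right)^2 + \theta^2 \Delta t^2}$, eigenvalues $\lambda_X, \lambda_Y$ of the form $\Delta t + 2\eta^2(1 - \cos(i\pi/(N+1)))$, and a positive correlation parameter $\theta$ (the bounds as written use $\theta\Delta t$, so the sign restriction is an assumption of this example). The function names are hypothetical.

```python
import math


def lambda_tridiag(i, N, dt, eta):
    """i-th eigenvalue of the tridiagonal block with noise standard deviation eta."""
    return dt + 2.0 * eta**2 * (1.0 - math.cos(i * math.pi / (N + 1)))


def lambda_pm(lam_x, lam_y, theta, dt):
    """Both eigenvalues of the 2 x 2 system [[lam_x, theta dt], [theta dt, lam_y]]."""
    mean = 0.5 * (lam_x + lam_y)
    root = math.sqrt(0.25 * (lam_x - lam_y) ** 2 + (theta * dt) ** 2)
    return mean + root, mean - root


def lemma13_holds(lam_x, lam_y, theta, dt):
    """Check the strict bounds on lambda_+ and lambda_-, assuming lam_x > lam_y and theta > 0."""
    lam_plus, lam_minus = lambda_pm(lam_x, lam_y, theta, dt)
    mean = 0.5 * (lam_x + lam_y)
    upper_branch = mean + theta * dt < lam_plus < lam_x + theta * dt
    lower_branch = lam_y - theta * dt < lam_minus < mean - theta * dt
    return upper_branch and lower_branch
```

Both branches solve the characteristic equation $(\lambda_X - \lambda)(\lambda_Y - \lambda) = (\theta\Delta t)^2$ of the reduced $2 \times 2$ system, which is what the eigenvector approach with constants $\alpha$ and $\beta$ amounts to.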

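In the following we define

$$\gamma_+^{(i)} = \frac{\lambda_+^{(i)}(\rho + N^{-1/4}h_N)}{\lambda_+^{(i)}(\rho)} - 1 > 0 \quad\text{and}\quad \gamma_-^{(i)} = \frac{\lambda_-^{(i)}(\rho + N^{-1/4}h_N)}{\lambda_-^{(i)}(\rho)} - 1 < 0$$

in analogy to the case of equal noise variances. We use the preceding lemma to obtain bounds for
these coefficients and show the LAN property with the same rate $N^{-1/4}$ as above, including bounds
for the Fisher information.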



Proposition 14. If $\eta_X > \eta_Y$ the following inequalities hold:



N-lhNAt+ _ ^ N-lhNAt 



X 



X 



pAt 



<r+ < 



+ pAt 



and 



-N-ihNAt ^ (i) ^ 
— < 71 < 



-N-ihNAt+ y 



pAt 



pAt 



(24a) 



(24b) 
Proof. Using the inequality (23a) in the preceding Lemma 13 we obtain the lower bound for $\gamma_+^{(i)}$.
From



(0 
74- 



2 



_|_ 



< 



x(i), 3,(0 

2 "I" 



+ {pAty + N-ih^At 



N'^HnAI ^ N'^ih^At 



if{p) 



\(») 1 1,(0 



we can deduce the upper bound using again the right-hand side of inequality (23a) in the last
inequality. The bounds for $\gamma_-^{(i)}$ follow analogously.



□ 



Now we are able to prove the LAN property in the same way as for the case of equal noise 



variances using the preceding inequalities. Because of Proposition 14 the inequalities 



AT 



N 



N 



1=1 



N-2h% {Aty 

^2 +pAt 



+ 



N-2hi {Aty 



and 



TV 



9 ^ 9 



i=l 



i=l 



N 

>E 

i=l 



/in 9 /a'*) 



N-thl {Atf+[hr^ 



> 



N 

E 



+ pAt 



+ 



a}^ - pAt 



N-^h%{Atf N-^h%{At) 



+ 



\^(A«+pAt)' (A«-pAt)' 



hold. In the lower bound the mixed terms drop out. 

Using those inequalities, the proof reduces to the method used before for the equal noise variance
case where we found that (Riemann) sums of this type can be approximated by integrals. We just
have to do this calculation twice, for the upper and the lower bound, changing only the constants
in the denominator of the integrated function, and obtain the convergence to $2h^2$ times the lower and
upper bound for $I(\rho)$ stated in (17), respectively.



□ 
Remark 15. Although the inequalities appearing in the proof for the case of different noise variances
are strict, the asymptotic results do not yield the strict inequalities in (17) for the lower and upper
bound for the asymptotic Fisher information. We suppose that the strict inequalities also hold, and a
numerical approximation for the Riemann sums using different special values indicated this too.



7 Simulation Results 

In this section we compare the simulation results for the subsampling and the multi-scale estimator.
A detailed comparison of the finite sample size performance of the subsampling and the Hayashi-
Yoshida estimator is given in Palandri (2006). We have shown that the multi-scale estimator is
asymptotically more efficient, which means that the rate of convergence is $N^{1/4}$ compared to $N^{1/6}$ for
the subsampling estimator. The following simulations investigate the behaviour of both estimators for
finite sample sizes.

To generate asynchronous observation times for the processes $X$ and $Y$ we take $t_i, i = 1, \ldots, n$ and
$\tau_j, j = 1, \ldots, m$ as arrival times of two independent Poisson processes such that $\Delta t_i \sim \mathrm{Exp}(\vartheta_X)$ and
$\Delta\tau_j \sim \mathrm{Exp}(\vartheta_Y)$. In our notation this means $E[\Delta t_i] = \vartheta_X$ and $E[\Delta\tau_j] = \vartheta_Y$. We set $T = 1$ and the
means of the time increments between observations equal to $\vartheta_X = \vartheta_Y = 1/30000$. The expected
number of observations for both processes is about the number of seconds during one trading day
and thus a typical high-frequency observation scheme. The two sets of observation times almost
surely have no intersection points, but all time increments are of order $1/N$ in probability (and thus
all assumptions imposed in the sections before are guaranteed).



Remark 16. In this special case, where the numbers of observations $n$ and $m$ follow independent
Poisson distributions with parameters $1/\vartheta_X$ and $1/\vartheta_Y$, we can prove that our synchronization method
creates $N$ synchronized observations with $E N = 1/\vartheta$ where

$$\vartheta = \vartheta_X + \vartheta_Y - \frac{\vartheta_X\vartheta_Y}{\vartheta_X + \vartheta_Y}.$$

For $\vartheta_X = \vartheta_Y\,(= 1/30000)$ we obtain $\vartheta = (3/2)\vartheta_X$ and $E N = (2/3)\vartheta_X^{-1}\,(= 20000)$.

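For our simulations we use constant parameters $\sigma_X = \sigma_Y = 1$ and $\rho \in [-1, 1]$ and neglect drift
terms. The increments of the efficient processes are then given by

$$\Delta X_{t_i} = \int_{t_{i-1}}^{t_i} dB_t^X$$

and

$$\Delta Y_{\tau_j} = \int_{\tau_{j-1}}^{\tau_j} dB_t^Y = \int_{\tau_{j-1}}^{\tau_j}\rho\,dB_t^X + \sqrt{1-\rho^2}\int_{\tau_{j-1}}^{\tau_j} dB_t$$

where $B$ is a standard Brownian motion independent of $X$. Therefore, we simulate values of $X$ for
all observation times of both processes and simulate the observations of $Y$ using the equation above. For
the discrete noise processes we assume $\epsilon_{t_i}^X \sim N(0, \eta_X^2)$ and $\epsilon_{\tau_j}^Y \sim N(0, \eta_Y^2)$.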
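The simulation design described above can be sketched as follows. This is a minimal illustration under assumptions, not the code used for the reported results: the function names are hypothetical, the horizon is $T = 1$, drift is neglected, both volatilities equal 1, and `eta_x`, `eta_y` are noise standard deviations. Both efficient processes are simulated on the union of the two sets of Poisson arrival times, and noise is added only at each process's own observation times.

```python
import math
import random


def poisson_times(mean_gap, horizon, rng):
    """Arrival times in (0, horizon] of a Poisson process with mean waiting time mean_gap."""
    times = []
    t = rng.expovariate(1.0 / mean_gap)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(1.0 / mean_gap)
    return times


def simulate_noisy_observations(theta_x, theta_y, rho, eta_x, eta_y, seed=0):
    """Return (t, x_obs, tau, y_obs): asynchronous noisy observations on [0, 1]."""
    rng = random.Random(seed)
    t = poisson_times(theta_x, 1.0, rng)
    tau = poisson_times(theta_y, 1.0, rng)
    grid = sorted(set(t) | set(tau))  # union of both sets of observation times
    x_path, y_path = {}, {}
    x_val = y_val = prev = 0.0
    for s in grid:
        step = math.sqrt(s - prev)
        db_x = rng.gauss(0.0, step)   # increment of B^X
        db = rng.gauss(0.0, step)     # increment of the independent Brownian motion B
        x_val += db_x
        y_val += rho * db_x + math.sqrt(1.0 - rho**2) * db
        x_path[s], y_path[s] = x_val, y_val
        prev = s
    x_obs = [x_path[s] + rng.gauss(0.0, eta_x) for s in t]
    y_obs = [y_path[s] + rng.gauss(0.0, eta_y) for s in tau]
    return t, x_obs, tau, y_obs
```

A call such as `simulate_noisy_observations(1 / 30000, 1 / 30000, 0.5, 0.1, 0.1)` would correspond to the sampling frequency of the setting above; the synchronization step and the estimators themselves are not part of this sketch.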

noise level $\eta_X^2 = \eta_Y^2$      $K_N$    $M_N$
$1/\sqrt{10}$                          646      216
$0.1$                                  300      122
$(1/\sqrt{10}) \cdot 0.1$              139       68
$0.01$                                  65       38
$(1/\sqrt{10}) \cdot 0.01$              30       22
$0.001$                                 14       12
$(1/\sqrt{10}) \cdot 0.001$              6        7
$0.0001$                                 3        4

Table 1: Calculated values for $K_N$ and $M_N$ for different noise levels $\eta_X^2 = \eta_Y^2$ and parameters
$\sigma_X = \sigma_Y = 1$.

Figure 2: Boxplot for multi-scale and subsampling estimator for $\rho = 0.5$ when $\eta_X^2 = \eta_Y^2 = \eta^2 = \sqrt{0.1}$.

To calculate the subsampling and the multi-scale estimators we first have to determine the number
of subsamples $K_N$ and the number of frequencies $M_N$, respectively. We know that a choice $K_N =
c_{sub}N^{2/3}$ and $M_N = c_{multi}N^{1/2}$, respectively, with constants $c_{sub}$ and $c_{multi}$, minimizes
the resulting mean square errors of the estimators. For our simulations we can calculate the optimal
constants because we know all the parameters. Inserting $K_N = c_{sub}N^{2/3}$ and $M_N = c_{multi}N^{1/2}$,
respectively, in the asymptotic variances of the estimators, minimization yields



Csub,opt — \/ '^^'x^Y ' ^mulU,opt 



36-35 
52 



If one uses the estimators for data of two asset prices the parameters are unknown and one can cal- 
culate the optimal constants above in the same way, but they will depend on the variances of the noise 
processes and the quarticities of both underlying Ito processes. Then one can estimate the constants 
by using adequate estimators for these parameters as proposed in Zhang (2006a) for example. 

Figure 2 shows a boxplot for 1000 Monte Carlo iterations for large noise variances
$\eta_X^2 = \eta_Y^2 = \eta^2 = \sqrt{0.1}$ that exemplifies a higher efficiency of our proposed multi-scale
estimator compared to the subsampling estimator, at least when microstructure noise effects are large.
Next we present a comparison of the resulting root mean square errors (RMSE) for different noise
levels. The results are illustrated in Figure 3. The RMSEs are calculated for each noise level
based on 1000 Monte Carlo iterations. Our simulations show that for very noisy data (noise level
$\eta_X^2 = \eta_Y^2 = \eta^2 > 0.01$) and about 30000 expected observations for both processes, the multi-scale
estimator has a significantly smaller root mean square error compared to the subsampling estimator.
The ratio of both RMSEs is increasing when the noise level decreases in the range $0.1 > \eta^2$. For
small noise levels and the same (expected) sample sizes the multi-scale estimator also has a smaller
RMSE, but the ratio of the RMSEs gets close to 1 and fluctuates for different (small) noise levels.
Our simulations thus confirm that for non-negligible market microstructure frictions our proposed
multi-scale estimator for the quadratic covariation of two Itô processes performs better than the
subsampling estimator (and of course the HY-estimator) not only asymptotically but also in the case of typical
sample sizes (for high-frequency intraday stock data).
Figure 3: Root mean square errors of subsampling and multi-scale estimator for different noise levels



We have chosen the ranges for the noise variances such that for the illustrated values the noise
variances decrease with a factor $1/\sqrt{10}$. Because of the factor $\eta_X^2\eta_Y^2$ in the variances of the estimators
due to noise the root mean square error (disregarding the discretization error) should decrease linearly.
This can be seen in the values in Figure 3 for large noise variances, when the error due to noise
dominates the error due to discretization, whereas the influence of the discretization error is stronger
for small noise levels.

In Figure 4 the root mean square errors of both estimators are plotted for different constant
parameter values of the correlation $\rho = k/10$, $k = 0, \ldots, 10$ when $\eta^2 = 0.01$, based on 200 Monte
Carlo iterations for each value. For all eleven parameter values the multi-scale estimator has a smaller
root mean square error, although the differences are not very large for this (small) noise level. We
can explain the increasing root mean square errors when $\rho$ increases by the dependence of the
discretization error on $\rho$. Although we did not state precisely a formula for the asymptotic variance
due to discretization, it is natural that the variance analyzed in Section 5 grows for higher values of
$\rho$. For this noise level the discretization error is influential enough to cause the different root mean
square errors illustrated in Figure 4.
Figure 4: Root mean square errors of subsampling and multi-scale estimator for a constant noise level
$\eta_X^2 = \eta_Y^2 = \eta^2 = 0.01$ and different correlations.



8 Conclusion 

We have presented and compared three estimators for the quadratic covariation of two Itô processes.
If we have discrete asynchronous observations without market microstructure noise, the Hayashi-
Yoshida estimator is a consistent estimator solving the problem of asynchronicity. However, when
microstructure frictions are relevant the estimator is not consistent any more, as we stated in
Proposition 1. If we deal with noisy asynchronous data, we have to synchronize the observations first.
We used the method presented by Palandri (2006) in Section 3 to rearrange the observations in an
adequate way. We have shown in Section 4 that a subsampling approach yields a consistent estimator
with $N^{1/6}$-rate of convergence, where $N$ denotes the number of synchronized observations. In
Section 5 we introduced our multi-scale estimator, which gains a higher efficiency and has an $N^{1/4}$-rate of
convergence, as stated in Theorem 1. This rate is optimal, which we have proved in a simplified
model even for the synchronous case by giving a lower bound for the rate of convergence using the
LAN property with rate $N^{-1/4}$ in Section 6. Proposition 11 comprises this result and the asymptotic
Fisher information. Simulations show that the multi-scale estimator performs better than the
subsampling estimator if the noise level is high enough and the sample size is not too small.